Blockchain Basics by Daniel Drescher is a great book for a newbie like me: its simple analogies and metaphors helped me understand the building blocks of Blockchain. This blogpost summarizes some of the main points in the book and includes a mindmap for each chapter.

The author says that the book addresses the gap between purely technical and purely application-oriented books on Blockchain. By providing a conceptual understanding of the technology, the author targets readers who want an introduction to the technical concepts behind Blockchain, and relies on four key didactic elements:

  1. Conversational style
  2. No mathematics and no formulas
  3. Incremental steps through the problem domain
  4. Use of metaphors and analogies

Thinking in Layers and Aspects

A software system can be partitioned in two ways, based on layers and aspects:

  • Application vs. implementation layers
  • Functions vs. nonfunctional aspects

Thinking of a mobile phone as a system, one can arrange these two partitions as a 2 by 2 matrix. Within this matrix, one can look at the integrity aspect of the system. There are three major components of integrity:

  • Data Integrity
  • Behavioral Integrity
  • Security

Understanding a system in this reductionist way is useful because it helps one focus on the essentials.

Seeing the Big Picture

There are two main ways to design the implementation of any system: centralized and decentralized. Each has advantages and disadvantages. One can also design a hybrid architecture that picks and chooses features from both. The area where Blockchain shines is in ensuring a specific nonfunctional aspect of a distributed software system: achieving and maintaining its integrity.

Recognizing the Potential

Distributed P2P architectures have massive potential to disrupt any industry in which middlemen facilitate the exchange of immaterial goods and services between buyers and suppliers. One of the biggest challenges in a distributed P2P system is the integrity of the entire system, and this is where Blockchain comes in. Distributed P2P systems use Blockchain as a tool to achieve and maintain integrity.

Discovering the Core Problem

The core problem that Blockchain is trying to solve is ensuring trust and integrity in a distributed P2P system. Integrity is a nonfunctional aspect of the system, but it is ultimately what drives trust in a system made up of a diverse set of nodes.

Disambiguating the Term

People use the word Blockchain in many ways. Some refer to it as a data structure, some call it an algorithm, some use it for a suite of technologies, and some use it as an umbrella term for systems that manage distributed ledgers using a set of Web 3.0 technologies.

The author gives the following as a provisional definition:

The Blockchain is a purely distributed P2P system of ledgers that utilizes a software unit that consists of an algorithm, which negotiates the informational content of ordered and connected blocks of data together with cryptographic and security technologies in order to achieve and maintain its integrity.

Proving and managing ownership is a billion dollar industry and Blockchain has a big role to play in the times to come.

Understanding the Nature of Ownership

This chapter delves into the heart of ownership, which has two main components: proof of ownership and use of ownership. The use of ownership is in turn composed of identification, authentication and authorization. The proof of ownership relates to mapping owners to property, and inevitably this involves a ledger. However, having a single ledger as the source of truth is risky, as it can be tampered with or lost. Blockchain works by replicating the ledger at each node, and the Blockchain algorithm ensures that the individual nodes collectively arrive at one consistent version of the state of ownership. Integrity in this system is its ability to make true statements about ownership.

Spending Money Twice

With physical currency, a transfer between two people is an actual physical handover, so the same note cannot be spent twice. The double spending problem arises with digital assets, where one can copy a digital good and transfer it more than once. This can be a particular problem for distributed ledger technology. Blockchain is a means of solving the double spending problem.

Planning the Blockchain

This chapter lays out the concepts one needs in order to understand how distributed ledger accounting is made possible via Blockchain. The seven broad steps are:

  1. Describing Ownership
  2. Protecting Ownership
  3. Storing transaction data
  4. Preparing ledgers to be distributed in an untrustworthy environment
  5. Distribution of ledgers
  6. Adding new transactions to ledgers
  7. Deciding which ledgers represent the truth

Documenting Ownership

The metaphor the author uses to describe Blockchain transactions is a relay race in which runners pass on the baton. To prove ownership, there are usually two types of systems: inventory-data-oriented systems and transaction-data-oriented systems. The latter is useful for establishing both ownership and the evidence of ownership. Blockchain follows the transaction-oriented approach and emphasizes the importance of ordering the transactions so that they can be aggregated to obtain the inventory level of each account. The Blockchain algorithm also has to ensure that the transactions recorded in the ledger are:

  • formally correct
  • semantically correct
  • authorized appropriately

In a Blockchain transaction, one comes across the following components:

  • an identifier of the account that is to hand off ownership to another account
  • an identifier of the account that is to receive ownership
  • the amount of goods to be transferred
  • the time the transaction is to be done
  • a fee to be paid to the system for executing the transaction
  • a proof that the owner of the account that hands off ownership indeed agrees with that transfer
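The components above, and the idea of aggregating ordered transactions into inventory levels, can be sketched in a few lines of Python. This is only an illustration: the `Transaction` class, its field names, the `balances` helper and the opening balance are all hypothetical, the signature is a placeholder string, and the only semantic check shown is that a sender cannot hand off more than it owns.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str      # account that hands off ownership
    receiver: str    # account that receives ownership
    amount: int      # amount of goods to be transferred
    timestamp: int   # time the transaction is to be done
    fee: int         # fee paid to the system for execution
    signature: str   # proof of the sender's agreement (placeholder here)


def balances(history, opening):
    """Replay transactions in time order to derive each account's inventory."""
    state = defaultdict(int, opening)
    for tx in sorted(history, key=lambda t: t.timestamp):
        cost = tx.amount + tx.fee
        if state[tx.sender] < cost:
            continue  # semantically incorrect: sender does not own enough
        state[tx.sender] -= cost
        state[tx.receiver] += tx.amount  # the fee goes to the system, not the receiver
    return dict(state)


history = [
    Transaction("alice", "bob", 5, 1, 1, "sig-a"),
    Transaction("bob", "carol", 3, 2, 1, "sig-b"),
    Transaction("alice", "carol", 50, 3, 1, "sig-c"),  # overspend, skipped
]
result = balances(history, {"alice": 10})
print(result)  # {'alice': 4, 'bob': 1, 'carol': 3}
```

Note that no balance is stored anywhere: the inventory is always derived from the ordered transaction history, which is why the ordering matters.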

Hashing Data

Hash functions are the backbone of security on a Blockchain. Given some pieces of data, there are several patterns for hashing them, such as:

  • Independent
  • Repeated
  • Combined
  • Sequential
  • Hierarchical
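To make the five patterns concrete, here is a minimal sketch using SHA-256 from Python's standard library. The function names and the way pieces are joined (a space for combined hashing, plain concatenation elsewhere) are my own assumptions for illustration, not something the book prescribes.

```python
import hashlib


def h(data: str) -> str:
    """SHA-256 hash of a string, as a hex digest."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def independent(items):
    """Hash every piece of data on its own."""
    return [h(x) for x in items]


def repeated(item, rounds=2):
    """Hash the data, then hash the resulting hash value again."""
    value = item
    for _ in range(rounds):
        value = h(value)
    return value


def combined(items):
    """Join all pieces first, then hash once."""
    return h(" ".join(items))


def sequential(items):
    """Update a running hash value as each new piece arrives."""
    value = h(items[0])
    for x in items[1:]:
        value = h(value + x)
    return value


def hierarchical(items):
    """Hash each piece, then hash the combination of those hash values."""
    return h("".join(h(x) for x in items))


data = ["Hello", "World"]
for pattern in (independent, combined, sequential, hierarchical):
    print(pattern.__name__, pattern(data))
print("repeated", repeated("Hello"))
```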

Hashing in the Real World

One can think of hash references as cloakroom tickets that can refer to other cloakroom tickets. Each ticket refers to a coat hook rather than to the coat itself. There are many ways to use hash references to store a collection of data; the two popular ones are chain based and tree based. Hash functions are also used to create computational puzzles that nodes have to solve. This is a key component of the Blockchain world, where mining refers to solving a computational puzzle: guessing a nonce value such that the resulting hash has the required number of leading zeros.
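The hash puzzle described above can be sketched as a simple trial-and-error loop. This is a toy version under my own assumptions: SHA-256 as the hash function, the nonce appended to the data as text, and difficulty counted as leading zeros of the hex digest. Real systems define these details differently.

```python
import hashlib


def puzzle_hash(data: str, nonce: int) -> str:
    """Hash the data combined with a candidate nonce."""
    return hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()


def solve_puzzle(data: str, difficulty: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not puzzle_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def check_solution(data: str, nonce: int, difficulty: int) -> bool:
    """Verifying a solution takes a single hash computation."""
    return puzzle_hash(data, nonce).startswith("0" * difficulty)


nonce = solve_puzzle("Hello World!", 3)
print(nonce, puzzle_hash("Hello World!", nonce))
```

Finding the nonce takes many guesses, while checking it takes one hash computation. Each extra leading hex zero multiplies the expected number of guesses by 16.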

Protecting User Accounts

A good analogy for understanding asymmetric cryptography is a physical mailbox. Anyone can put letters into it, but only the owner has access to its contents. In a similar way, Blockchain relies heavily on asymmetric cryptography in two specific ways:

  • Identifying accounts: information flows from the public key to the private key
  • Authorizing transaction data: information flows from the private key to the public key

One thing to note is that public key and private key are labels used to make the roles clear. More precisely, there is a key pair in which one key is used for encryption and the other for decryption.

Authorizing Transactions

The basic idea of digital signatures is as follows:

  1. You write your message
  2. You generate a hash value of the message, encrypt that hash value with your private key, and append the encrypted value to the message as its signature
  3. The receiver generates the hash value of the message and compares it with the hash value obtained by decrypting the signature with your public key. If they match, the message has not been tampered with
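The three steps can be illustrated with textbook RSA on deliberately tiny numbers. This is a toy sketch and not secure: the primes 61 and 53, the exponents, and the reduction of the hash modulo `N` are assumptions chosen so the arithmetic stays readable. Real signatures use vetted libraries and much larger keys, and with a modulus this small two different messages can occasionally share a signature.

```python
import hashlib

# Toy RSA key pair from the classic textbook primes p = 61, q = 53.
N = 61 * 53   # 3233, part of both keys
E = 17        # public exponent
D = 2753      # private exponent, since (E * D) % 3120 == 1


def digest(message: str) -> int:
    """Hash the message and squeeze the result into the toy key's range."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Step 2: encrypt the hash value with the private key."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Step 3: decrypt with the public key and compare with a fresh hash."""
    return pow(signature, E, N) == digest(message)


signature = sign("transfer 5 units to bob")
print(verify("transfer 5 units to bob", signature))
print(verify("transfer 500 units to bob", signature))
```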

Storing Transaction Data

The best way to understand the Blockchain data structure is to imagine transforming a physical book into a structure with the following features:

  • Content is stored on separate pages and the content page is identified by a unique reference identifier
  • Ordering of the pages is done by maintaining the current and previous page reference on each page
  • Pages are loosely connected via reference numbers
  • Browsing direction is backward

The Blockchain data structure is a specific kind of data structure that is made up of ordered units called blocks. Each block of the Blockchain structure contains a block header and a Merkle tree that contains the transaction data.

One can imagine the ordered chain of block headers as being the digital equivalent to an old-fashioned library card catalog, where the individual catalog cards are sorted according to the order in which they are added to the catalog.

Having each block header reference its preceding block header preserves the order of the individual block headers and blocks. Each block header is identified by its cryptographic hash value and contains a hash reference to its preceding block header and a hash reference to the application-specific data whose order it maintains. The hash reference to the application-specific data is typically the root of a Merkle tree that maintains hash references to that data.

Using the Data Store

Changing data somewhere in the Blockchain requires changing all the blocks that follow the changed block as well. This takes a very large amount of computational power, and hence Blockchain transactions are considered immutable. It is an append-only data structure in which any edit is resource intensive.
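A small sketch shows why a change ripples forward. It assumes a simplified block made of just a data string, the previous block's hash and its own hash; the dictionary layout and helper names are hypothetical, and the hash puzzle that makes each rewrite expensive is left out.

```python
import hashlib

GENESIS = "0" * 64  # placeholder reference used by the first block


def block_hash(data: str, prev_hash: str) -> str:
    """A block's hash covers its data and its reference to the previous block."""
    return hashlib.sha256(f"{prev_hash}|{data}".encode("utf-8")).hexdigest()


def build_chain(items):
    chain, prev = [], GENESIS
    for data in items:
        current = block_hash(data, prev)
        chain.append({"data": data, "prev": prev, "hash": current})
        prev = current
    return chain


def first_broken_link(chain):
    """Return the index of the first inconsistent block, or None if all is well."""
    prev = GENESIS
    for index, block in enumerate(chain):
        stale = block_hash(block["data"], block["prev"]) != block["hash"]
        if block["prev"] != prev or stale:
            return index
        prev = block["hash"]
    return None


chain = build_chain(["tx-a", "tx-b", "tx-c"])
intact = first_broken_link(chain)

chain[1]["data"] = "tx-forged"            # manipulate the middle block
after_edit = first_broken_link(chain)

chain[1]["hash"] = block_hash("tx-forged", chain[1]["prev"])  # patch that block
after_patch = first_broken_link(chain)

print(intact, after_edit, after_patch)    # None 1 2
```

Patching the forged block only moves the break one block further, so the forger has to rewrite every block up to the head of the chain.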

The steps performed in order to add new transaction data are:

  • Create a new Merkle tree that contains all new transaction data to be added
  • Create a new block header that contains both a hash reference to its preceding header and the root of the Merkle tree that contains the new transaction data
  • Create a hash reference to the new block header which is now the current head of the Blockchain data structure
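The three steps above can be sketched as follows. This is a simplified illustration: transactions are plain strings, the header holds only the two hash references, an odd number of nodes is handled by duplicating the last one (one common convention, assumed here), and the names `merkle_root` and `append_block` are made up for this example.

```python
import hashlib
import json


def sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merkle_root(transactions):
    """Step 1: hash transactions pairwise, level by level, down to a single root."""
    level = [sha(tx) for tx in transactions]
    if not level:
        return sha("")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def append_block(chain, transactions):
    """Steps 2 and 3: build the header, then hash it to get the new head."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    header = {"prev": prev, "merkle_root": merkle_root(transactions)}
    head = sha(json.dumps(header, sort_keys=True))
    chain.append({"header": header, "hash": head, "transactions": list(transactions)})
    return head


chain = []
first = append_block(chain, ["alice->bob:5", "bob->carol:3"])
second = append_block(chain, ["carol->alice:1"])
print(second)
```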

Protecting the Data Store

The Blockchain protects the history of transaction data from manipulation and forgery by storing transaction data in an immutable data store.

The history of transaction data is made immutable by utilizing two ideas:

  • Storing the transaction data in the change-sensitive Blockchain data structure, which when being changed requires rewriting the data structure starting at the point that causes the change until the head of the whole chain
  • Requiring the solution of the hash puzzle for writing, rewriting, or adding every single block header in the Blockchain data structure

The block header contains at least the following data:

  • hash reference to the header of its preceding block
  • root of the Merkle tree
  • difficulty of its hash puzzle
  • time when solving the hash puzzle was started
  • nonce that solves the hash puzzle

The chapter has a nice illustration that gives an overview of solving the mining puzzle. I had always thought that it was some complex mathematical puzzle that will require considerable brain power. Turns out that all it takes is sufficient time and computing power to crack the puzzle that doesn’t involve anything more than checking a set of numbers that satisfy a condition.

Distributing the Data Store Among Peers

An immutable data structure is useful, but unless it is replicated there will always be the risk of a single point of failure. Distributing the data structure means that the system needs a mechanism for communicating changes in the data structure to all the nodes.

The Blockchain communicates over the network in the following way:

  • Messages are sent in a gossip style. Every node that receives a message will forward it to the peers it communicates with, which in turn will handle the message in the same way
  • Duplicates of transactions or blocks are identified and filtered out based on their cryptographic hash values
  • Each node can order the received information because transaction data and block headers contain time stamps
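The three points above can be simulated in memory. This sketch assumes a tiny fully connected network of three nodes and a message made of a time stamp and a text payload; the `Node` class and its methods are hypothetical, and real networks deal with latency, failures and untrusted clocks that are ignored here.

```python
import hashlib


class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = set()   # hash values of messages already handled
        self.inbox = []     # (timestamp, payload) pairs

    def receive(self, timestamp, payload):
        digest = hashlib.sha256(f"{timestamp}|{payload}".encode("utf-8")).hexdigest()
        if digest in self.seen:
            return          # duplicate: filter it out, do not forward again
        self.seen.add(digest)
        self.inbox.append((timestamp, payload))
        for peer in self.peers:   # gossip: pass it on to every peer
            peer.receive(timestamp, payload)

    def ordered(self):
        """Order received information by its time stamp."""
        return [payload for _, payload in sorted(self.inbox)]


a, b, c = Node("a"), Node("b"), Node("c")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]

a.receive(2, "tx-2")   # arrives first but carries the later time stamp
c.receive(1, "tx-1")

for node in (a, b, c):
    print(node.name, node.ordered())
```

Every node ends up with the same two messages in the same order, even though they entered the network at different nodes and in the opposite order.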

Verifying and Adding Transactions

A metaphor for the concepts involved in verifying and adding transactions: think of an outsourcing company that has been given the task of marking answer sheets against the solution sheets. The rules are set up so that the employees have an incentive to give their best. However, there is a chance that the evaluators collude and create an undesirable outcome.

The lesson learned from this metaphor is that the combination of reward, punishment, peer pressure, and competition can be used to manage a group of independently acting individuals as long as they do not collectively counteract.

The main components of the transaction reporting system are:

  • Validation rules
  • Reward
  • Punishment
  • Competition
  • Peer Control

Choosing a Transaction History

There are two ways to choose a transaction history: the longest chain or the heaviest chain. Distributed consensus has the following consequences:

  • Orphan blocks
  • Reclaimed reward
  • Clarifying ownership
  • Reprocessing of transactions
  • A growing common trunk
  • Eventual consistency
  • Robustness against manipulations
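The two selection criteria mentioned at the start of this section can be contrasted in a short sketch. It assumes each block records the difficulty of its hash puzzle as a number, and that the weight of a chain is the sum of those difficulties; the block layout and function names are hypothetical.

```python
def longest_chain(chains):
    """Pick the candidate history with the most blocks."""
    return max(chains, key=len)


def chain_weight(chain):
    """Total puzzle difficulty accumulated along the chain."""
    return sum(block["difficulty"] for block in chain)


def heaviest_chain(chains):
    """Pick the candidate history that took the most computational effort."""
    return max(chains, key=chain_weight)


chain_a = [{"id": "a1", "difficulty": 1},
           {"id": "a2", "difficulty": 1},
           {"id": "a3", "difficulty": 1}]
chain_b = [{"id": "b1", "difficulty": 5},
           {"id": "b2", "difficulty": 5}]

print(len(longest_chain([chain_a, chain_b])))          # 3
print(chain_weight(heaviest_chain([chain_a, chain_b])))  # 10
```

Here the two criteria disagree: `chain_a` has more blocks, but `chain_b` represents more accumulated effort. The blocks of whichever chain loses become orphan blocks.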

Paying for Integrity

The choice of payment currency for rewarding miners in a Blockchain network has an impact on aspects such as its openness, its distributed nature, and its integrity. Hence, whatever the form of payment, it must have the following characteristics:

  • Digital form
  • Widely accepted
  • Not subjected to capital restrictions
  • Stable value
  • Trustworthy
  • Decentralized

Bringing the Pieces Together

This chapter integrates all the concepts and metaphors introduced in the previous chapters:

Concept | Purpose | Metaphor Used
Transaction data | Describing the transfer of ownership | Bank transfer form
Transaction history | Providing the current state of ownership | Course of a relay race
Cryptographic hash value | Identifying any kind of data | Human fingerprints
Asymmetric cryptography | Encrypting and decrypting data | Public mailbox with lock
Digital signature | Stating agreement with the content of transaction data | Handwritten signature
Hash reference | Reference that becomes invalid once the data being referred to changes | Cloakroom tickets that utilize hash values for identifying cloak hooks
Change-sensitive data | Storing data in a way that makes any manipulation stand out immediately | Jackets that carry cloakroom tickets in their pockets
Hash puzzle | Imposing a computationally expensive task | Opening a number lock by trial and error
Blockchain data structure | Storing transaction data in a change-sensitive way and maintaining their order | A library with a card catalog
Immutability | Making it impossible to change the history of transaction data | Attempt to establish a fake family history
Distributed P2P network | Sharing the transaction history among all nodes of a network | Groups of independent witnesses
Message passing | Ensuring that all nodes of the system eventually receive all information | Gossip
Blockchain algorithm | Ensuring that only valid transaction data are added to the Blockchain data structure | Carrot and stick approach to ruling contractors
Distributed consensus | Ensuring that all nodes of the system use the identical history of transaction data | Beaten path in a park as a result of visitors voting with their feet
Compensation | Giving nodes an incentive to maintain integrity | Bakery that pays its employees with bread

Seeing the Limitations

Reinventing Blockchain

Using the Blockchain

Summarizing and Going Further

Takeaway

As a complete noob in this space, I now understand the basic components of the Blockchain, thanks to the book's awesome use of metaphors and visuals. It is definitely worth reading if you are like me and just beginning to understand this field.