One of the hottest technologies of late is no doubt blockchain. But what is a blockchain? A blockchain is a digital record of transactions that is arranged in chunks of data called blocks. Each block is linked to the one before it cryptographically, using a hash function. Linked together, these blocks form an unbroken chain: a blockchain. A blockchain can record not just financial transactions but virtually anything of value. Another name for blockchain is distributed ledger. In this blog post, I will explain the basic idea behind a blockchain and how it works.
Centralization vs. Decentralization
In the traditional client-server architecture, we store transaction records in a database in a centralized location. Figure 1 shows the interactions between clients and servers.
Figure 1. Centralized data storage
However, storing your data in a central location has the following disadvantages:
- Potential for data loss
- Threats to data integrity
The first point is easy to mitigate by replicating the data in multiple locations (known as backups). The second point is more challenging. How do you ensure the integrity of the data stored on the server? A good example is banks. How can you be sure that the balance in your bank account reflects the correct amount of money you possess? In most countries, people trust their banks and governments to maintain correct records of their personal possessions. But in some countries, the governments are corrupt and people have very little trust in them. Hence, a centralized approach to data storage is not always ideal.
Hence the birth of blockchain – the decentralization of data storage, commonly known as a distributed ledger. Using decentralization, Figure 1 will now look like Figure 2.
Figure 2. Decentralized data storage
Storing the same transactions on multiple computers ensures that no single computer can alter the data on its own, since every other computer still holds the original copy. If a malicious actor wishes to alter a transaction, he must modify it not just on one computer, but on all the computers holding the transactions. The more computers participating in the network, the more computers he needs to modify. Hence, decentralization removes the need to trust a central authority. The system becomes trustless, since everyone now holds the records.
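To make the replication idea concrete, here is a minimal sketch in Python. It assumes four hypothetical nodes that each hold a copy of the same list of transactions, and it flags any node whose copy differs from the most common one. The names `fingerprint` and `find_outliers` are my own, and comparing copies by simple majority is only an illustration, not how a real blockchain network reaches agreement.

```python
import hashlib
from collections import Counter


def fingerprint(ledger):
    """Return a SHA-256 digest that summarises a list of transaction strings."""
    return hashlib.sha256("\n".join(ledger).encode("utf-8")).hexdigest()


def find_outliers(copies):
    """Return the names of nodes whose copy differs from the most common copy."""
    prints = {name: fingerprint(ledger) for name, ledger in copies.items()}
    majority, _ = Counter(prints.values()).most_common(1)[0]
    return [name for name, fp in prints.items() if fp != majority]


ledger = ["A sends B 5 BTC", "B sends C 2 BTC", "A sends D 1 BTC"]

# Assumption: four hypothetical nodes, each starting with an identical copy.
copies = {name: list(ledger) for name in ("node1", "node2", "node3", "node4")}

# One node alters its own copy; the other three copies are untouched.
copies["node3"][0] = "A sends B 50 BTC"

print(find_outliers(copies))  # ['node3']
```

The altered copy stands out because the other three nodes still agree with each other. To go unnoticed, the attacker would have to change most of the copies, not just one.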
The Blocks in Blockchain
In Figures 1 and 2, you saw that the database contains transactions. Typical transactions may look like this:
- A sends to B 5 BTC (Bitcoins)
- B sends to C 2 BTC
- A sends to D 1 BTC
It is important to note that blockchains are used not only for cryptocurrencies like Bitcoin and Ether, but can be used for anything of value. Transactions are grouped into blocks (see Figure 3).
Figure 3. Transactions are grouped into blocks
Transactions are grouped into blocks so that they can be efficiently verified and then synchronized with other computers on the network. Once a block is verified, it is added after the last block in the blockchain, as shown in Figure 4.
Figure 4. Linking blocks to form a blockchain
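As a rough illustration of the grouping step, the following Python sketch splits a list of pending transactions into blocks. The `Block` class, the `group_into_blocks` function, and the fixed limit of three transactions per block are all assumptions made for this example. Real networks decide what goes into a block using other limits.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A hypothetical block: a position in the chain plus a batch of transactions."""
    index: int
    transactions: List[str] = field(default_factory=list)


def group_into_blocks(transactions, block_size):
    """Split a flat list of transactions into consecutive blocks."""
    blocks = []
    for start in range(0, len(transactions), block_size):
        batch = transactions[start:start + block_size]
        blocks.append(Block(index=len(blocks), transactions=batch))
    return blocks


pending = [
    "A sends B 5 BTC",
    "B sends C 2 BTC",
    "A sends D 1 BTC",
    "C sends A 1 BTC",
    "D sends B 3 BTC",
]

# Assumption: at most three transactions per block, purely for illustration.
chain = group_into_blocks(pending, block_size=3)
for block in chain:
    print(block.index, block.transactions)
```

With five pending transactions and a limit of three, the sketch produces two blocks: the first holds three transactions and the second holds the remaining two.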
Blockchain gets its name from the fact that blocks are chained to each other cryptographically. In order to ensure the correct order of transactions in the blockchain, each block contains the hash of the previous block, as shown in Figure 5.
Figure 5. Using hashing to chain the blocks in a blockchain
Storing the hash of the previous block assures the integrity of the transactions. Any modification to a transaction in a block changes that block's hash, so the hash stored in the next block no longer matches, and the mismatch carries through to the subsequent blocks in the blockchain. If a hacker wants to modify a transaction, he needs to modify not only the block containing it, but every block that comes after it. In addition, he needs to propagate the changes to all the other computers on the network, which is a computationally expensive task.
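The chaining described above can be sketched in a few lines of Python using SHA-256 from the standard `hashlib` module. This is a simplified model, not the actual block format of any network: the function names, the dictionary layout, and the all-zero placeholder hash for the first block are assumptions made for illustration.

```python
import hashlib


def block_hash(transactions, prev_hash):
    """Hash a block's transactions together with the previous block's hash."""
    payload = prev_hash + "|" + "|".join(transactions)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(groups):
    """Build a list of blocks, each storing the hash of the block before it."""
    chain = []
    prev_hash = "0" * 64  # assumption: placeholder, since the first block has no predecessor
    for txs in groups:
        chain.append({"transactions": list(txs), "prev_hash": prev_hash})
        prev_hash = block_hash(txs, prev_hash)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose stored hash is wrong, or None."""
    for i in range(1, len(chain)):
        prev = chain[i - 1]
        expected = block_hash(prev["transactions"], prev["prev_hash"])
        if chain[i]["prev_hash"] != expected:
            return i
    return None


chain = build_chain([["A sends B 5 BTC"], ["B sends C 2 BTC"], ["A sends D 1 BTC"]])
print(first_broken_link(chain))  # None: every link checks out

# Tamper with a transaction in the first block.
chain[0]["transactions"][0] = "A sends B 50 BTC"
print(first_broken_link(chain))  # 1: the hash stored in the second block no longer matches
```

To hide the change, the attacker would have to recompute the hash stored in the second block, which in turn changes the second block's own hash and breaks the third block, and so on down the chain.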
The first block in a blockchain is known as the genesis block. Each blockchain has its own genesis block – the Bitcoin network has its own genesis block, and likewise Ethereum also has its own genesis block.
Nodes in a Blockchain Network
I have mentioned earlier that in a decentralized network, there are many computers holding on to the transactions. So we can now replace the transactions with the blockchain, as shown in Figure 6.
Figure 6. Full nodes in the blockchain network containing the blockchain
Computers storing the blockchain are known as full nodes. They help to relay transactions and blocks to other nodes. They also make the network robust, as there are now multiple nodes in the network with little risk of a single point of failure.
Miners
Among all the full nodes in a blockchain network, some are known as miners. Miners add blocks to the blockchain. In order to add a block to the blockchain, a miner needs to do the following:
- Take the transactions in the previous block and combine them with the hash stored in that block to derive the previous block's hash.
- Store the derived hash in the current block.
Figure 7 outlines the process:
Figure 7. Storing the hash of the current block in the next block
Hashing itself is straightforward, and a machine can compute a hash in a matter of milliseconds. So how do you ensure that all the miners have an equal chance to mine a block? To solve this problem, the blockchain network (such as Bitcoin or Ethereum) inserts a network difficulty target into every block, so that a block counts as mined only if its hash meets the criteria set by the difficulty target. For example, the difficulty target may dictate that the resultant hash must start with 5 zeros; otherwise, the block is not accepted. As more miners join the network, the network automatically adjusts the difficulty target so that blocks continue to be mined at a roughly constant rate.
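To get a feel for what "must start with 5 zeros" means in practice, here is a small Python sketch. It assumes the hash is written in hexadecimal and that the zeros are hexadecimal digits, which the example above does not specify. The function names are hypothetical, and real networks express the target as a number that the hash must not exceed, rather than counting zeros.

```python
def meets_target(hash_hex, leading_zeros):
    """Return True if a hexadecimal hash string starts with enough zeros."""
    return hash_hex.startswith("0" * leading_zeros)


def expected_attempts(leading_zeros):
    """Average number of hashes to try before one meets the target.

    Each hexadecimal digit is zero with probability 1/16, so a hash starts
    with n zeros with probability 1/16**n.
    """
    return 16 ** leading_zeros


print(meets_target("00000a3f9c", 5))  # True
print(meets_target("0000a3f9c1", 5))  # False: only four leading zeros

for zeros in range(1, 6):
    print(zeros, expected_attempts(zeros))
# 1 16
# 2 256
# 3 4096
# 4 65536
# 5 1048576
```

Under this assumption, every extra required zero multiplies the expected work by 16, which is how a network can make mining harder or easier just by changing the target.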
In order to meet the difficulty target, miners need to inject a number called a nonce (number used once) into the block. So instead of deriving a hash from just the transactions and the hash of the previous block, they now include the nonce in the hashing operation as well. The miners compete with each other to guess the value of the nonce that gives a resultant hash meeting the difficulty target. And that’s basically what miners do! Their job is to find the value of the nonce.
The updated blockchain now looks like Figure 8.
Figure 8. Miners work hard to find the value of the nonce
The process of finding the nonce is called Proof-of-Work (PoW). Once the nonce is found, the entire block, including the nonce, is broadcast to the other nodes, informing them that the block has been mined and is ready to be added to the blockchain. The other nodes can then verify that the nonce does indeed satisfy the difficulty target, stop their current mining process, and move on to mine the next block. The key concept behind PoW is that the nonce is difficult to find but easy for others to verify once it has been found. A good analogy is a digital lock: it is difficult to find the correct key combination to unlock it, but very easy to verify the combination once you have found it.
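The difference between finding a nonce and verifying one can be seen in the Python sketch below. It is a toy model under stated assumptions: the block is hashed as a simple string with SHA-256, the miner tries nonces 0, 1, 2 and so on, and the target is only three leading hexadecimal zeros so that it finishes quickly. The function names are my own, and real networks use a different block format and far harder targets.

```python
import hashlib


def block_hash(transactions, prev_hash, nonce):
    """Hash the transactions, the previous block's hash, and the nonce together."""
    payload = prev_hash + "|" + "|".join(transactions) + "|" + str(nonce)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine(transactions, prev_hash, leading_zeros):
    """Try nonces 0, 1, 2, ... until the hash meets the target.

    Returns the winning nonce and the number of hashes computed.
    """
    prefix = "0" * leading_zeros
    nonce = 0
    while not block_hash(transactions, prev_hash, nonce).startswith(prefix):
        nonce += 1
    return nonce, nonce + 1


def verify(transactions, prev_hash, nonce, leading_zeros):
    """Check a claimed nonce with a single hash computation."""
    return block_hash(transactions, prev_hash, nonce).startswith("0" * leading_zeros)


txs = ["A sends B 5 BTC", "B sends C 2 BTC", "A sends D 1 BTC"]
prev_hash = "0" * 64  # assumption: placeholder value for the previous block's hash

nonce, attempts = mine(txs, prev_hash, leading_zeros=3)
print(f"nonce {nonce} found after {attempts} hashes")
print(verify(txs, prev_hash, nonce, leading_zeros=3))  # True, after computing one hash
```

The miner has to compute as many hashes as it takes, while any other node needs only one hash to confirm the result. That asymmetry is what the lock analogy is getting at.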
When a miner has successfully mined a block, the miner earns a block reward (newly created coins) as well as the transaction fees from the transactions in that block.
Summary
Note that I have simplified the explanation of a blockchain and have not dived into the details of how transactions are stored in a blockchain, or the role of Merkle trees and the Merkle root. However, this should serve as a good introduction to blockchain and will hopefully clarify a lot of the doubts about blockchain that you may have. If you want to learn more about blockchain and Smart Contract programming, I hope to see you soon at an NDC workshop near you!