Definition
A blockchain is a distributed, append-only database that is shared among the nodes of a computer network. It has no central authority, but rather relies on consensus between nodes to validate the chain of data as it gets built. The nodes all hold a copy of the data and can serve it independently.
Blockchains guarantee the fidelity and security of a record of data and generates trust without the need for a trusted third party (like a bank, a government, or a company).
Blockchains are best known for their crucial role in cryptocurrency systems, such as bitcoin (2009) or ethereum (2013), even though they were invented in 1991 for timestamping documents in a way that could not be tampered with.
Blocks
The internal data structure of a blockchain is an ordered linked list of blocks, which are immutable data containers.
Blocks contain a nonce, new transaction data, the previous block’s hash, and the hash of the previous pieces, concatenated.
Modifying the transactions (the data records) of a block will therefore invalidate not only its own hash, but all subsequent ones.
Once the block is approved through consensus algorithms, it is committed and written, and will forever be a part of the blockchain.
The hash function is public knowledge (usually, sha256), meaning anyone can validate the chain easily.
Consensus
Because blockchains are decentralized, there is not central authority that can be trusted to know that the next block (containing all new transactions) is valid. Therefore, they require a consensus algorithm to build confidence about this validity.
This is actually a hard problem that peer-to-peer networks have struggled with, and bitcoin solved it cleverly by using an algorithm called proof-of-work, originally invented in 1993 to combat spam emails or other denial of service type attacks.
The intuition of this algorithm is that all nodes compete to be the creator of the next block. They all have the transaction data to put in it, and a hard puzzle condition (create an output_hash
hash with multiple leading zeroes). They randomly try different nonces and compute:
output_hash = hash(nonce, data, previous_block_hash)
Finding the right nonce is hard, but verifying it is easy. Miners get rewarded for winning this game, incentivizing more miners, and more miners means more security to the network.
There are other consensus algorithms, the most promising of which is called proof-of-stake.