For years, the question of blockchain scalability has been debated in developer communities. Public blockchain networks, like Ethereum, require several nodes to validate transactions, limiting their ability to scale.
Ethereum, for instance, can process around 10-13 transactions per second. This pales in comparison to centralized systems like VISA, capable of handling up to 24,000 TPS.
If blockchains—and decentralized applications running on them—are to enjoy mass adoption, population-level scalability is necessary.
In addition to layer 2 blockchains, sharding is a proposed solution for scaling Ethereum to support more users. The idea of sharding is to break up the main blockchain into separate segments, so nodes only need to verify a subset of transactions.
With nodes validating transactions in parallel, network throughput can increase, and dApps can scale to meet the needs of a growing number of users.
Database sharding, a common technique in centralized database management, is the process of dividing a large database into smaller chunks ("shards") and distributing them across several machines that work in parallel, which improves efficiency and application scalability.
As the number of users or operations on an application increases, so does the data stored in its database. An overloaded database degrades app performance and harms the user experience. Sharding relieves the load on any single database and improves load times.
Database Sharding Example
Here's an illustration to explain database sharding:
Imagine there's a database containing personal records for 100,000 residents in a city.
Finding information for an individual could require scanning up to 100,000 records, which is a costly and time-intensive undertaking.
But what happens if we partition this large database into smaller databases?
For example, if all city residents whose surnames start with a given letter are grouped on a dedicated server, finding information requires fewer computational resources, tasks take less time to complete, and the database becomes easier to manage.
What is a Shard?
A "shard" means a "small part of the whole." In database management, a shard is a subset of a large database hosted on a separate server. While each shard contains chunks of data, they all form one logical dataset.
Using our previous example, we could have one shard, "Shard 1," for city residents with surnames starting with 'A', another, "Shard 2," for those with surnames starting with 'B', and so on.
If you combine these logical shards, you'd get a single dataset of records for all city residents.
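The surname example can be sketched in a few lines of Python. This is only an illustration: the function names and sample records are made up for this article, and each "shard" is an in-memory dictionary, whereas in a real deployment each shard would live on its own server.

```python
from collections import defaultdict


def shard_key(surname: str) -> str:
    """Route a record to a shard using the first letter of the surname."""
    return surname[0].upper()


def build_shards(records):
    """Partition (surname, data) records into one shard per starting letter."""
    shards = defaultdict(dict)
    for surname, data in records:
        shards[shard_key(surname)][surname] = data
    return shards


def lookup(shards, surname):
    """Search only the one shard that could contain the surname."""
    return shards.get(shard_key(surname), {}).get(surname)


# Hypothetical sample data
records = [("Adams", "12 Oak St"), ("Baker", "4 Elm Rd"), ("Brown", "9 Pine Ave")]
shards = build_shards(records)

print(sorted(shards))            # shard names in use
print(lookup(shards, "Baker"))   # only shard "B" is searched
```

A lookup for "Baker" never touches the records in shard "A", which is the whole point: each query works against a fraction of the data.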
Sharding in blockchain networks follows the same process as in centralized databases: a blockchain network is "sharded," or split into distinct segments, and each shard stores a portion of the blockchain's data and processes a unique set of transactions.
With sharding, blockchain networks can improve network latency and scalability.
What problem is sharding in blockchain networks aiming to solve?
Because all nodes must reach consensus (i.e. agree) on transaction validity, blockchain networks can only process a small number of transactions at the same time.
Typically, every node stores the blockchain's entire history and processes every transaction. This is what makes blockchain networks like Ethereum and Bitcoin "decentralized."
With every full node owning a copy of the network's complete history, it becomes harder for malicious actors to hijack the network and potentially reverse or rewrite transactions.
Ensuring blockchain decentralization and security comes at the cost of scalability, though.
Sharded blockchains allow nodes to forgo downloading the full history of the blockchain and validating every transaction passing through the network, which increases network efficiency and enables blockchains to scale to meet greater user demand.
What is a shard chain?
In the context of blockchain networks, a shard chain would contain a portion of the data and handle a portion of the transaction processing responsibilities.
Shard chains are like a collection of mini-blockchains that operate independently, and to preserve security, each shard chain submits a record of transactions to the main chain (Beacon Chain) at regular intervals through the Validator Manager Contract (VMC).
Because each shard chain will have a unique transaction history and its own set of nodes to validate new transactions, multiple shard chains can run simultaneously, reducing network latency and increasing throughput through parallel processing.
Ethereum is planning to adopt sharding as a scaling solution after its proof-of-stake (PoS) upgrades, a series of upgrades designed to improve the functionality of Ethereum 1.0.
Why is sharding necessary?
There are two main problems that necessitate sharding on Ethereum: the ability to support an exponential increase in users, and the need to remain decentralized at scale.
1. Support an increasing number of users
Ethereum’s present structure makes it unable to handle exponential increases in usage.
Currently, all Ethereum nodes store the complete state of the Ethereum Virtual Machine (EVM), including smart contract code and account balances.
Moreover, transactions are executed linearly and require confirmation by the entire network.
Executing transactions linearly and requiring nodes to manage large sets of data slows down the network.
2. Maintain decentralization at scale
Requiring nodes to keep a full copy of the blockchain also creates centralization issues. Already, the Ethereum ledger takes up 10+ terabytes of storage space, which is 10x what the average computer can hold.
As the Ethereum blockchain keeps growing, running an Ethereum node may become difficult, leaving only a few nodes in charge of securing the network. This re-introduces centralization and single-point-of-failure problems Ethereum was designed to solve, reducing its value.
Sharding can solve both problems.
Sharding promotes better scalability since nodes can validate different transactions simultaneously, and dividing transactional data into smaller chunks makes it easier to run a full node, which decreases the risk of centralization.
What is Ethereum's Casper protocol upgrade?
Casper is a protocol upgrade that will introduce proof-of-stake consensus to the Ethereum blockchain. Ethereum currently uses a proof-of-work consensus algorithm which requires nodes to solve cryptographic puzzles before adding new transactions to the chain.
While proof-of-work encourages decentralization and security, validating transactions requires a lot of energy and reduces processing speeds, which can have adverse effects on the environment and the web3 user experience.
With the Casper upgrade, Ethereum will switch from proof-of-work to proof-of-stake. Instead of competing to add new transactions, validators will stake ether in a smart contract for the right to propose new blocks.
In a PoS Ethereum, the main chain (Beacon Chain) will serve as the consensus and coordination layer for shard chains. All shard chains are tightly coupled with the main chain, strengthening the security of the system. Shard blocks can only be valid if approved by the main chain.
If sharding is adopted, validators (notaries) will be randomly assigned to individual shards to vote on the validity of collations. For a collation to get added to the main chain, it must receive confirmation from at least two-thirds of the notaries on the shard.
Before we explain how sharding works, here are some important definitions:
State refers to the information about a system at any point in time. In Ethereum, state is a description of the network at a particular time—contract code, accounts, address balances, etc. Every new transaction alters Ethereum's state.
A Merkle tree is a cryptographic data structure that summarizes large amounts of information via hashes; the Merkle root is the single hash at the top of the tree. Merkle trees and roots are essential for Ethereum's security, as they allow nodes to quickly verify if a piece of data is part of the larger structure.
A collation is a group of transactions conducted on a shard chain, similar to a block in proof-of-work (PoW). Collations are submitted to the main chain and linked together to form the blockchain.
The collation header is similar to a block header in proof-of-work consensus. A collation header contains metadata about the information inside the collation such as:
- The single shard that the collation belongs to
- The root hash of the parent collation
- The Merkle root of all transactions in a collation
- The pre-state root and post-state root
- Signatures of notaries
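To make the header fields above more concrete, here is a rough Python sketch. The field names, the `CollationHeader` class, and the `merkle_root` helper are assumptions made for illustration, not an actual Ethereum data format. The Merkle root is computed with a simple binary tree that duplicates the last node on odd levels; Ethereum itself uses a different structure (Merkle Patricia tries), so treat this purely as a demonstration of the idea.

```python
import hashlib
from dataclasses import dataclass, field


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash leaves pairwise, level by level, until a single root remains."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


@dataclass
class CollationHeader:
    """Hypothetical header layout mirroring the fields listed above."""
    shard_id: int
    parent_hash: bytes
    tx_root: bytes
    pre_state_root: bytes
    post_state_root: bytes
    signatures: list[bytes] = field(default_factory=list)


# Placeholder transactions and all-zero roots, for illustration only
txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
header = CollationHeader(
    shard_id=1,
    parent_hash=bytes(32),
    tx_root=merkle_root(txs),
    pre_state_root=bytes(32),
    post_state_root=bytes(32),
)
print(header.tx_root.hex())
```

Changing or reordering any transaction changes `tx_root`, which is what lets a node detect tampering by comparing a single 32-byte hash.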
Notaries are validators randomly assigned to a shard chain to vote on proposed collations. These votes are called "attestations" and prove collation validity. Every collation needs at least ⅔ of notaries to sign off on it before being added to the consensus chain.
A proposer is a collator (or validator) selected to create a collation and propose it for validation. The proposer has the same duties as a miner in PoW blockchains.
A committee is a collection of validators or notaries that attest the validity of shard blocks. These committees are randomly shuffled at intervals, so validators cannot predict which committee(s) they'll be in.
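The committee shuffling described above might look something like the following sketch. It assumes a seed is available for each interval; here it is just an integer passed to Python's `random.Random`, which is not cryptographically secure and stands in for whatever randomness the protocol would actually supply. The function name and validator labels are hypothetical.

```python
import random


def shuffle_into_committees(validators, num_committees, seed):
    """Shuffle validators with a seed, then deal them round-robin into committees."""
    rng = random.Random(seed)  # stand-in for protocol randomness (assumption)
    shuffled = list(validators)
    rng.shuffle(shuffled)
    return [shuffled[i::num_committees] for i in range(num_committees)]


validators = [f"validator-{i}" for i in range(12)]

# A new seed at each interval produces a fresh assignment
epoch_1 = shuffle_into_committees(validators, 3, seed=1)
epoch_2 = shuffle_into_committees(validators, 3, seed=2)

for committee in epoch_1:
    print(committee)
```

Because the assignment depends on a seed that validators cannot know ahead of time, they cannot choose which committee they land in.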
The Ethereum sharding upgrade will split the Ethereum blockchain into 64 shard chains. Every shard chain will have an independent state, meaning nodes will store only a subset of account balances and smart contract code and process only a portion of the total transactions.
How might Ethereum PoS sharding work in practice?
Imagine Ethereum has 10,000 validators and 100 shard chains.
Through a pseudorandom protocol, eligible validators, who have deposited ETH in the Validator Manager Contract (VMC), are assigned to shards 1-100.
In Shard 1, a validator (proposer) is selected to group new transactions into a collation.
Other validators (notaries) download the collation and verify the validity of transactions.
If two-thirds of notaries attest to the collation, it is submitted to the main chain via the VMC.
It's important to note that the entire collation isn't added to the Beacon Chain—it would be difficult and time-wasting to verify collations from every shard.
Instead, the validator nodes on the main chain simply check the attestations (signatures) for each collation to determine its validity.
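The flow above can be sketched as a small Python simulation. Everything here is hypothetical: the header is a plain dictionary with a placeholder root, notary votes are booleans rather than signatures, and `submit_collation` only stands in for the role the VMC plays in the description above.

```python
def reaches_two_thirds(votes: dict[str, bool]) -> bool:
    """True when at least two-thirds of the assigned notaries attested."""
    return 3 * sum(votes.values()) >= 2 * len(votes)


def submit_collation(main_chain: list, header: dict, votes: dict[str, bool]) -> bool:
    """Record only the header and its attesters on the main chain, not the full collation."""
    if not votes or not reaches_two_thirds(votes):
        return False
    attesters = sorted(name for name, attested in votes.items() if attested)
    main_chain.append({"header": header, "attesters": attesters})
    return True


main_chain = []
header = {"shard_id": 1, "tx_root": "0xabc"}  # placeholder values

# Two of three notaries attest: the threshold is met
accepted = submit_collation(main_chain, header, {"n1": True, "n2": True, "n3": False})

# One of three notaries attests: the collation is rejected
rejected = submit_collation(main_chain, header, {"n1": True, "n2": False, "n3": False})

print(accepted, rejected, len(main_chain))
```

Note that the main chain entry contains no transactions at all, only the header and the list of attesting notaries.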
Thanks to collation headers, anyone can verify the activity on each shard.
Collation headers function as "cross-links" and descriptions of the state and transactions on different shards. Thus, cross-shard communication makes it possible to have a top-level view of the Ethereum network without being part of every shard.
While ETH2 sharding promises many benefits, it introduces a new set of problems:
- With fewer nodes running each shard, malicious activity such as a 51% attack becomes easier.
- With more complex code, the risk of smart contract security vulnerabilities increases.
- Committee members can collude to submit malicious transactions to the main chain.
ETH2 Sharding Protocol Security Measures
Fortunately, the Ethereum sharding roadmap has several mechanisms to improve the security of sharding protocols.
The first of these security mechanisms involves using fraud proofs to check transactions on shard chains. As Vitalik Buterin explains, fraud proofs can be used to prove that a transaction is invalid and to punish dishonest activity.
The second mechanism uses random sampling to prevent collusion. If validators don't know which shard they'll end up in, then it becomes harder to coordinate an attack on the system. In this scenario, a 51% attack on a shard chain becomes far harder to execute.
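One way to get a feel for the effect of random sampling is to estimate how likely it is that an attacker ends up with a two-thirds majority in a randomly drawn committee. The sketch below assumes committees are drawn uniformly at random without replacement from the full validator set (a hypergeometric model); the validator counts and committee size are made-up inputs, not protocol parameters.

```python
from math import comb


def capture_probability(total: int, malicious: int, committee_size: int) -> float:
    """Probability that at least 2/3 of a randomly sampled committee is malicious."""
    needed = -((-2 * committee_size) // 3)  # ceiling of 2/3 of the committee
    favorable = sum(
        comb(malicious, k) * comb(total - malicious, committee_size - k)
        for k in range(needed, min(malicious, committee_size) + 1)
    )
    return favorable / comb(total, committee_size)


# Hypothetical scenario: 10,000 validators, committees of 100
p_30 = capture_probability(10_000, 3_000, 100)  # attacker controls 30% of validators
p_60 = capture_probability(10_000, 6_000, 100)  # attacker controls 60% of validators

print(p_30)
print(p_60)
```

Under these assumptions, an attacker holding a minority of all validators has only a vanishingly small chance of dominating any single committee, and the chance grows as the attacker's overall share grows.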
Discussions around sharding have been ongoing in the Ethereum community since at least 2013, but developers have postponed implementing it—for good reason. Sharding is highly complex and introduces new risks, so rigorous testing is required to work out any kinks.
According to Ethereum.org, sharding will roll out on Ethereum after "The Merge" has taken place. For context, the Merge refers to the event where the PoW Ethereum main network (mainnet) integrates with the Beacon Chain.
The Beacon Chain is an implementation of the Casper proof-of-stake system and produces the randomness required to create a functional sharding system. This chain went live on December 1, 2020.
Here is a brief overview of the Ethereum sharding timeline:
What is the ETH 2.0 sharding timeline?
According to Ethereum.org, sharding will be deployed on Ethereum after "The Merge," when the PoW Ethereum main network (mainnet) integrates with the Beacon Chain.
Sharding Phase 1
This phase is likely to kick off in 2023, according to Ethereum's planned upgrade timeline. However, no specific dates have been outlined for the sharding timeline yet.
Here’s an overview of what the first phase of Ethereum sharding may look like:
- Validator Manager Contract (VMC) hosted on the Beacon Chain responsible for coordinating the sharding process
- Prospective ETH2 validators are required to lock up 32 ETH into the smart contract before getting added to the pool of eligible validators
- VMC assigns validators to shards at intervals to validate transaction collations and submit them to the consensus chain
- Shards only serve as "data depots" to increase data processing abilities of the Ethereum network.
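The deposit requirement in the list above could be modeled roughly as follows. This is a toy Python model, not the interface of any real contract: the `ValidatorRegistry` class, its method names, and the example addresses are all invented for illustration. Only the 32 ETH threshold comes from the text.

```python
MIN_DEPOSIT_ETH = 32  # deposit required before joining the validator pool


class ValidatorRegistry:
    """Toy stand-in for the deposit-tracking role of the VMC (hypothetical)."""

    def __init__(self):
        self.deposits = {}

    def deposit(self, address: str, amount_eth: int) -> None:
        """Add to the running deposit total for an address."""
        self.deposits[address] = self.deposits.get(address, 0) + amount_eth

    def eligible_validators(self) -> list[str]:
        """Addresses whose total deposit meets the minimum."""
        return sorted(a for a, amt in self.deposits.items() if amt >= MIN_DEPOSIT_ETH)


registry = ValidatorRegistry()
registry.deposit("0xaaa", 32)  # hypothetical addresses
registry.deposit("0xbbb", 16)
print(registry.eligible_validators())

registry.deposit("0xbbb", 16)  # top up to reach the minimum
print(registry.eligible_validators())
```

Only addresses in the eligible pool would then be candidates for assignment to shards.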
Sharding Phase 2
The second phase of the ETH PoS sharding upgrade is less defined, as developers debate some aspects. However, we can expect Ethereum sharding phase 2 to look like this in practice:
- Shards move from being data layers to code execution layers, and each shard has an independent "state" (i.e., a unique set of smart contracts, account balances, and addresses).
- Each shard acts like the Ethereum Mainnet with full smart contract and dApp support
- Cross-shard communication allows users on different shard chains to exchange value.
- dApps running on different shard chains can "talk" and interact with each other using cross-shard communication, improving Ethereum's scalability.
With multiple shard chains running simultaneously, nodes can increase their transaction processing capacity and are able to process greater amounts of data on-chain.
Continuing the previous example, if 100 shard chains are each processing 100 transactions per second, then Ethereum 2.0 will be able to achieve 10,000 TPS.
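The arithmetic behind that figure is simple multiplication. The snippet below assumes shards run fully in parallel with no cross-shard overhead, so the result is best read as an upper bound rather than a forecast; the function name is made up for this example.

```python
def aggregate_tps(num_shards: int, tps_per_shard: int) -> int:
    """Upper bound on total throughput if every shard runs fully in parallel."""
    return num_shards * tps_per_shard


total_tps = aggregate_tps(num_shards=100, tps_per_shard=100)
print(total_tps)  # 10000
```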
The only information from shards published on the base layer chain is the collation headers, which serve as cryptographic proofs of validity, so it becomes easier for validator nodes to confirm transactions and commit them to the consensus layer chain.
The result is faster transaction finality and lower network latency.
While estimates vary, the introduction of sharding is expected to scale Ethereum to handle hundreds of thousands of transactions per second.
With higher TPS rates, Ethereum can provide the scalability that dApps need to handle spikes in usage and billions of users.