Achieving Consensus on the Internet Computer
The asynchronous finalization of the blockchain’s novel consensus protocol dey unbelievably fast — finality happen for under two seconds.
By Manu Drijvers, Engineering Manager (Consensus) | DFINITY
Security and reliability dey essential to the vision of the Internet Computer — the world first web-speed, internet-scale public blockchain, wey dey enable smart contracts to securely serve interactive web content directly into the browsers of end users. The Internet Computer dey extend the functionality of the public internet so e go fit host software, wey go make next-generation dapps and open internet services possible.
The Internet Computer dey powered by machines in data centers across the world wey dey communicate through the Internet Computer Protocol (ICP). By collaborating in this way, the machines form a virtual Internet Computer wey dey allow developers to write and deploy pieces of software called canister smart contracts for secure and reliable way. “Secure” mean sey the state of a canister go only change according to the rules of the canister and no fit dey tampered with, and “reliable” means wey the canister no fit suddenly stop dey run.
The Internet Computer go achieve these properties with a novel consensus protocol: the different machines, or replicas, must reach consensus on which inputs to process and in what order dem go process am, such that them go maintain a coherent state. Every piece of software dey executed by many machines around the world instead of just one; the majority of the machines together define the true state of the software. If some individual replicas report a tampered state, get connection issues, or are even malicious, this no go make a difference as long as the majority of the replicas correctly execute the software.
Additionally, we wan make the Internet Computer to scale, sey we fit run more and more decentralized applications in the form of canister smart contracts on the platform. To achieve this scalability requirement, we go split the canisters into smaller groups, and each group go run on wetin we call a subnet. A subnet dey powered by many different replicas around the world, and they all execute the canisters wey dey run on that subnet. Canisters on different subnets fit also communicate securely. We fit always add subnets to the Internet Computer, thereby growing its capacity.
Why consensus dey important
Any number of replicas fit dey unavailable or malicious without affecting the correct state of the subnet as long as the majority of machines dey function correctly. This approach to use replication to gain security dey require consensus protocol. The subnet must process different messages, namely messages from users to canisters and from canisters to canisters. And they must all process the same messages in the same order, so that all replicas go always end up in the same state — but each of the replicas wey power the subnet fit actually see the messages for different order. We dey use consensus protocol so that all of the nodes wey dey power subnets fit agree on the order of the messages to process.
We go reach consensus by using blockchain. The messages wey the subnet go process dey grouped together and placed in blocks, where each block points to previous block, and e go form a chain of blocks. The property of security dey achieved when all replicas agree on the blockchain, to give ordering of the messages to execute.
More precisely, we want wetin we call “safety” — that is, if two honest replicas think sey they agree on the blockchain up to a certain point, them go get the same view of the blockchain up to that point.
Secondly, we want “liveness,” wey mean sey the blockchain go dey grow and we dey agree on more and more blocks and continuously process additional messages.
Thirdly, we want wetin we call “validity,” sey all of the blocks and the messages in the blocks dey valid. This, for example, go ensures sey all messages in blocks are correctly signed by the sender.
The difficulty here be sey we want these three properties to hold even if some of the nodes wey dey power the subnet fit become faulty, go offline, or even be actively malicious. More precisely, we want all of these properties to hold as long as less than a third of the nodes in a subnet dey offline or malicious. In the examples below, we will go use four nodes, wey mean sey we go satisfy these properties if at most one of them dey malicious. At Genesis, the Internet Computer go use larger subnets wey fit tolerate more than one corrupt replica.
Chain Key Cryptography: The Scientific Breakthrough Behind the Internet Computer
Chain Key cryptography is a set of cryptographic protocols wey orchestrate the nodes wey make up the Internet…
We’re going to zoom into a single subnet and see how we use a blockchain to reach consensus, but e dey important to remember sey the Internet Computer consists of many subnets, and therefore many blockchains. We go present our consensus protocol by a building on top of a peer-to-peer gossip network. We go often talk sey the replica “sends this message out.” By that, we mean sey we go use our gossip network to exchange the message with the other replicas on the subnet. We also only go focus on the ordering of the messages. Some other parts of our protocol go take care of the processing of these messages.
E dey important to note sey we chose to design our consensus protocol wen we tailoring am to the needs of the Internet Computer. We dey try to optimize for throughput, latency, and also protocol simplicity. Our protocol dey contain four main parts:
First, we get block making. This part creates candidate blocks out of which we fit build blockchains.
Second, we get notarization. This part dey responsible for identifying valid blocks out of which we fit build valid blockchains.
Third, we get the random beacon. The random beacon go give us some randomness wey we fit use to further enhance our protocol.
Lastly, we get finalization, wey go tell us when we do actually reach agreement.
Block making
A replica on the subnet fit serve as the block maker, proposing a block to extend the chain. E go get some messages available go dey processed by the canisters wey dey run on this subnet. It fit get blockchain up to a certain height (e.g., 29). E dey gather messages wey dey wait to be processed, group them together into a block, and propose an extension to the blockchain by to send am on this gossip network to the other replicas.
E dey important to remember sey we want this protocol to work even if some participants dey misbehaving. This means sey we no fit elect one single block maker to extend the blockchain because this one block maker fit actually dey malicious and we fit dey stuck, wey go compromise our liveness. Therefore, we must get many replicas serving as block makers.
Notarization
For the same reason, these block proposals fit be invalid. To deal with that, we get a notarization process wey dey work in rounds, and in every round e dey ensure sey we get at least one valid block wey fit extend the blockchain.
As an example, make we say Replica 1 get notarized blockchain up to height 29. If e kon see a block extending the blockchain at height 30, e go validate that block. If Replica 1 see sey this block dey valid, e fit place a cryptographic signature on am wey we call the notarization share. We go send the notarization share to the other replicas and the subnets expressing that Replica 1 think sey na good block.
Maybe Replica 3 and Replica 4 fit also create notarization shares on that same block. Make we define sey three out of the four replicas be sufficient approval. Note sey in this instance sey three out of four na the highest percentage of approval we fit hope to get, where the protocol go make progress even if one of the notes dey misbehaving or offline. We combine these three notarization shares into a single artifact, wey we call the notarization, meaning block 30 dey notarized. The notaries go kon move on to the next round and start dey look for blocks at height 31.
For these notarization shares, we use special signatures we we call multi-signatures. Multi-signatures get nice property wey allows many signatures on the same message to fit compress into a single constant-size signature wey proves sey all of the nodes sign the message. This means sey even if we get a very large subnet with many notaries, the notarization will still be a small artifact.
Notarization no go always work as well as just described. The replica fit see a valid block and create a notarization share on that block. However, it fit kon see another candidate block at the same height, wey also valid. If the replica fit only sign one of the blocks, we go actually get stuck because some notaries fit support one block while others support another block, and neither go ever get enough approval. To satisfy the liveness property, the notary go actually sign both of the blocks, making sure sey at least one of them go become notarized this way. This means sey we may obtain multiple notarized blocks at one height.
Random beacon
The block maker and the notary fit identify valid blocks, but we neva not reach agreement yet because there fit be multiple notarized blocks at every height — so weti we get fit look like a tree of many valid blocks. In order to final achieve agreement on the blockchain, we go to reduce the number of notarized blocks wey we get every round by adding something to our protocol: the random beacon. At every height, we get a random-looking artifact that we call a random beacon, which is an unpredictable value.
replica fit get a random beacon at height 29 and when the notarization process for round 29 finishes, it decides that it is time to create the next random beacon. E go create a special signature on the previous random beacon value which we call a random beacon share. This na another artifact wey we share via the gossip network.
If we kon get another random beacon share, we fit combine the shares to construct the next random beacon value. We do this by using special signatures, namely threshold BLS signatures. They have the special property that they are unique, wey no matter which replicas participate to construct the value. However, they are also unpredictable because the replicas wey construct sey that value by themselves. These threshold BLS signatures require special key material where a secret key is shared among the parties, which we set up via a protocol called noninteractive distributed key generation.
Now wey we get this common randomness we go use it to rank the block makers every round.
For example, using the random beacon wey we created in round 23, we go rank the block makers in round 24. At round 24, we go agree sey Replica 1 na the top priority block maker (i.e., the rank 0 block maker). Of course we go still need to have fallbacks because Replica 1 fit not do this job properly. For example, Replica 4 fit be chosen as the rank 1 block maker, the first fallback. And if that no work, then Replica 3 na the rank 2 block maker. And finally, Replica 2 na the last resort. Since the random beacon offers common randomness, all the replicas agree on this ordering of the block makers.
We go use this block maker ranking to further enhance the notary behavior. More precisely, when a notary enter a round, e go starts a timer. At the beginning of the timed period, e dey only looking to create a notarization share for the block by the rank 0 block maker. Only if that one fail after a certain amount of time the notary go fall back to a block proposal from the rank 1 block maker. After another preset amount of time, e go fall back even further to the rank 2 or rank 3 block maker.
By having the notary take the block maker ranking into consideration, the goal be sey we reduce the number of notarized blocks wey we get every round. make we look at example: Replica 1 dey round 30 and dey receive a valid block proposal for height 30, but na only the rank 1 block maker. It is going to wait because e no dey willing to sign the rank 1 block yet. If things go well, e fit kon receive a block proposal from the rank 0 block maker. Now the notary go create a notarization share on this block, and not on the rank 1 block. If sufficiently many other notaries do the same, then only the rank 0 block go become fully notarized. In this scenario, we actually reduce the number of blocks wey become notarized. This go bring us closer to consensus.
Now we fit still get multiple notarized blocks in some rounds, but hopefully in most rounds, we get only one notarized block from the rank 0 block maker. In fact, if only one notary dey around, then we don already reach agreement. This na because a valid chain must exist of notarized blocks in every round and if only one candidate block, then every chain wey moves pass that point must go through that block. Therefore, the chain implied by that block must actually dey agreed upon.
Finalization
So the challenge remain: How we fit know sey we don reach reached agreement? How replicas fit know sey them fit accept the chain? One option go be for a replica to simply wait for some amount of time and assume sey after this time period e go don see all notarized blocks wey exist, and if na just one notarized block at a height, e considers the chain to be agreed upon up to this height.
But this na very risky approach. Perhaps the network neva dey functioning properly, and there are actually other notarized blocks that we simply have not seen yet. This means that if we do not wait long enough, we may violate the safety property. This puts us in a very difficult position; we want blocks to be agreed-upon quickly wey the the subnets dey responsive for the user, but at the same time, we know that wetin need to wait a long time to guarantee the safety property.
We go trade-off by using a different mechanism to observe agreement. We get a separate asynchronous finalization process wey go helps us detect when we have reached agreement. Remember sey notaries create notarization shares until they see sey one block is fully notarized, at which point they move on to the next round. Now we go have the notaries share some information about how many blocks they notarized, wey go help us reach agreement. More precisely, if the notary no create any notarization shares for blocks other than the first notarized block it received in that round, e go create a different type of signature we fit call the finalization share.
finalization share on a height-30 block from Replica 1 essentially means that Replica 1 did not notary-sign any height-30 block other than this one. This is another artifact that it will gossip to the rest of the subnets. If a sufficient number of replicas also create finalization shares for the same block, then we can aggregate them into a single finalization. Whenever we see such a finalization on the block, we know that we can trust the blockchain up to that point because a finalization is proof that no other notarized block at that height can exist. If the network behaves well, this means that we can actually reach agreement on blocks very quickly compared to traditional blockchains, because we can quickly create these finalization shares and reach agreement. With this asynchronous finalization approach, a block can be finalized in less than two seconds after being proposed.
In addition, we are not at risk of network attacks. Even if the network does not behave well, we know that we are still safe because we only rely on signatures and not on the assumption that we have seen all messages.
A full finalization mean sey other notarized blocks no fit exist as long as less than a third of the nodes of a subnet dey corrupt. make we go a little bit into the theory and see why that wetin actually the case in the scenario with four nodes. We know that if a block is finalized, it means sey three notaries created finalization shares on this block because we have four nodes. We assume sey most one of them is corrupt, o possible but that still means sey we know for sure that at least two replicas were honest and created a finalization share on the block. We also know sey honest replicas only create such a finalization share if they no sign any other block at that height.
That means sey that two replicas did not create notarization shares for any block other than the finalized one at this height. We also know sey a notarization requires three notarization shares. This means sey three replicas must participate to reach that, but we only have four replicas on the subnet. Additionally, talk that two out of them definitely did not contribute to notarizing any other block, which leaves only two replicas. And two no dey sufficient to reach this notarization threshold of three, thereby concluding the proof that finalization implies that no other notarized blocks at that height.
This dey under the assumption sey less than a third of the replicas dey corrupt. The above demonstration use a subnet size of four replicas with at most one that was corrupt, but you can easily extend the same argument to “n” replicas out of which “f” dey corrupt, as long as “f” dey less than a third of “n.”
In summary, we get a consensus protocol wey consists of four components. The block maker creates candidate blocks to extend the blockchain. We get notarization process wey fit ensures valid blocks dey identified. We and a random beacon wey we use to rank block makers and reduce the number of notarized blocks we get in every round. Lastly, wew use an asynchronous finalization mechanism wey let us know when a blockchain don actually agree upon having to rely on networking assumptions altogether.
This consensus protocol go allows us to use replication within a subnet, achieving the security and reliability that we want from the Internet Computer.
_____
Stay tuned for more releases detail the technologies behind the Internet Computer.