Vitalik Buterin et al Publish Paper on Fraud Proofs For Sharding and Light Clients


Vitalik Buterin, Ethereum’s co-founder, and two PhD students at UCL, Mustafa Al-Bassam and Alberto Sonnino, have published a paper describing fraud proofs so that “light clients can get near-full-node-equivalent assurances of block validity.”

Light clients are the eth wallets on your smartphone. They’re very light and place quite a bit of trust on miners as they do not themselves validate protocol rules.

They’ve worked fine so far, but the need for far higher security on sharding may now make light wallets more secure. The paper says:

“Our work also plays a key role in efforts to scale blockchains with sharding, as in a sharded system no single node in the network is expected to download and validate the state of all shards, and thus fraud proofs are necessary to detect invalid blocks from malicious shards.”

The highly technical, but somewhat accessible, paper describes a method by which nodes can validate a block, then publish proof of such validity. Light wallets or shards can then validate the proof and if anything is wrong they reject the block. There is however a problem. The paper says:

“A malicious block producer could prevent full nodes from generating fraud proofs by withholding the data needed to recompute dataRooti and only releasing the block header to the network.

The block producer could then only release the data—which may contain invalid transactions or state transitions—long after the block has been published, and make the block invalid. This would cause a rollback of transactions on the ledger of future blocks.”

The proof part is arguably somewhat easy, but this data withholding problem is pretty hard to the point some in Bitcoin Core have said fraud proofs are impossible. Buterin says:

“Essentially, erasure codes plus fraud proofs are used to convert a ‘100% data availability’ problem into a ‘75% data availability’ problem, which can more easily be solved with random sampling techniques.”

The paper describes this process in more depth and we quote to some extent the relevant part as it is fairly easy to follow:

“A block producer compiles a block of data consisting of k shares, extends the data to 2k shares using Reed-Solomon encoding, and computes a Merkle root (the dataRooti) over the extended data, where each leaf corresponds to one share.

When light clients receive a block header with this dataRooti, they randomly sample shares from the Merkle tree that dataRooti represents, and only accept a block once it has received all of the shares requested. If an adversarial block producer makes more than 50% of the shares unavailable to make the full data unrecoverable (recall in recalled in Section 2.3 that Reed-Solomon codes allow recovery of 2t shares from any t shares), there is a 50% chance that a client will randomly sample an unavailable share in the first draw, a 25% chance after two draws, a 12.5% chance after three draws, and so on, if they draw with replacement. (In the full scheme, they will draw without replacement, and so the probability will be even lower.)

Note that for this scheme to work, there must be enough light clients in the network sampling enough shares so that block producers will be required to release more than 50% of the shares in order to pass the sampling challenge of all light clients, and so that the full block can be recovered.”

Now that we have the full block, we can create the fraud proofs, light nodes can check it, and light nodes can in effect become pretty much a full node.

This has the assumption that at least one full node is honest, and obviously the more honest full nodes the better.

In the context of sharding, presumably that would mean one honest node per shard, which means nodes in sharding would be extremely important and would in a way translate to the more nodes, the more capacity.

They will be rewarded for the service with about 5% interest once Ethereum 2.0 launches, with it all in combination returning public blockchains closer to one CPU one vote for in Ethereum 2.0 nodes would have the meaning they had in the bitcoin whitepaper.

The reaction of Gregory Maxwell, former Blockstream CTO and current Bitcoin Core developer, was to state:

“Error coded anti-withholding been discussed many times– and I’ve been pretty bummed that I’ve been unable to excite people much about the idea.”

The paper does say: “There have been online discussions about how one may go about designing a fraud proof system, however no complete design that deals with all block invalidity cases and data availability has been proposed.”

The paper includes a fraud proofs prototype and a data availability prototype with this appearing to be a breakthrough of sorts as fraud proofs have long been argued as the primary bottleneck to scaling public blockchains.

That bottleneck now seems to have been addressed, with the resource usage consumption here being 14kb per 1MB block with validation occurring at about 1 second.

Fraud Proofs size, Sep 2018.
Fraud Proofs validation time, Sep 2018.

A light client has to do the above for each block, but they do not have to store the proofs. Once they check that the block is valid, they can disregard the 14kb, thus seemingly consuming no storage.

That means in this design a light node can automatically reject an invalid block, so a miner can’t fool them as a light node in this design would be upholding protocol rules, just as a full node.

If that is indeed the case, then full node storage and synching can become a non-problem, and scalability itself would practically be solved with all that’s left now just coding it all.



Leave a Reply

Notify of