Scalability has become a central topic of technical discussion in the cryptocurrency space. The Bitcoin blockchain is now over 12 GB in size, so a new bitcoind node needs several days to fully synchronize; the UTXO set that must be kept in RAM is approaching 500 MB; and continued software improvements in the source code are simply not enough to offset the trend. With every passing year, it becomes harder for an ordinary user to run a fully functional Bitcoin node on their own desktop, and even though the price, merchant acceptance, and popularity of Bitcoin have soared, the number of full nodes in the network has essentially stayed the same since 2011. The 1 MB block size limit currently puts a theoretical cap on this growth, but at a high cost: the Bitcoin network cannot process more than 7 transactions per second. If the popularity of Bitcoin grows tenfold once again, the limit could push transaction fees up to nearly a dollar, making Bitcoin less useful than PayPal. If there is one problem that a successful implementation of cryptocurrency 2.0 must solve, it is this one.
The reason we in the cryptocurrency space face these problems, and have made so little headway on a solution, is one fundamental issue shared by all cryptocurrency designs. Of all the proof of work, proof of stake, and reputational consensus-based blockchain designs that have been proposed, not one has overcome the same core problem: every full node must process every transaction. Building nodes that can process every transaction, even at thousands of transactions per second, is possible; centralized systems such as PayPal, Mastercard, and banking servers do it without difficulty. The problem is that setting up such a server takes a large amount of resources, so nobody except a few large businesses has any incentive to do it. Once that happens, those few nodes become vulnerable to profit motives and regulatory pressure, and may start making theoretically unauthorized changes to the state, such as giving themselves free money. All other users, who depend on those centralized nodes for security, would have no way of proving that a block is invalid, since they lack the resources to process the entire block.
Ethereum, as of this point, makes no fundamental improvement on the principle that every full node must process every transaction. Various Bitcoin developers have proposed ingenious ideas involving multiple merge-mined chains with a protocol for moving funds from one chain to another, and these will be a large part of our cryptocurrency research effort; however, research into how to implement them optimally is still in its early stages. Nonetheless, with the introduction of Block Protocol 2.0 (BP2), we have a protocol that, although it does not solve the fundamental blockchain scalability problem, does get us partway there: as long as at least one honest full node exists (and, for anti-spam reasons, holds at least 0.01% of the mining power or ether ownership), “light clients” that download only a small amount of data from the blockchain can retain the same level of security as full nodes.
What Is A Light Client?
The basic idea behind a light client is that, thanks to a data structure present in Bitcoin (and, in a modified form, Ethereum) called a Merkle tree, it is possible to construct a proof that a given transaction is in a block, such that the proof is much smaller than the block itself. Currently, a Bitcoin block is roughly 150 KB in size; a Merkle proof for a transaction is approximately half a kilobyte. If Bitcoin blocks grow to 2 GB in size, the proofs might expand to a full kilobyte. To construct a proof, one simply follows the “branch” of the tree from the transaction up to the root and provides the sibling nodes along the way. Using this mechanism, light clients can be assured that transactions sent to them (or from them) actually made it into a block.
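To make the branch idea concrete, here is a minimal Python sketch of building and checking a Merkle proof. It is a simplification, not Bitcoin's actual serialization: it uses double SHA-256 and duplicates the last node on odd-sized levels, as Bitcoin does, but ignores byte-order details, and the function names (`merkle_levels`, `merkle_proof`, `verify_proof`) are made up for illustration. The list of transactions is assumed to be non-empty.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses in its Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_levels(leaves):
    """Return every level of the tree: leaf hashes first, root level last."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # odd level: pair the last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_proof(leaves, index):
    """Return (root, proof), where proof lists the sibling hash at each level."""
    levels = merkle_levels(leaves)
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # the node was paired with itself
        proof.append(level[sibling])
        index //= 2
    return levels[-1][0], proof

def verify_proof(leaf, index, proof, root):
    """Recompute the branch from the leaf up and compare with the known root."""
    node = h(leaf)
    for sibling in proof:
        if index % 2 == 0:
            node = h(node + sibling)
        else:
            node = h(sibling + node)
        index //= 2
    return node == root
```

A light client that knows only the root from the block header can call `verify_proof` on a transaction it cares about. Each level of the tree adds one 32-byte hash to the proof, so the proof grows with the logarithm of the number of transactions rather than with the block size.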
This makes it substantially harder for malicious miners to trick light clients. In a hypothetical world where running a full node is completely impractical for ordinary users, suppose a user claims to have sent 10 BTC to a merchant who lacks the resources to download the entire block. The merchant would not be helpless; they could ask for a proof that a transaction sending 10 BTC to them is actually in the block. If the attacker is a miner, they could be more sophisticated and actually put such a transaction into a block, but have it spend funds (i.e., UTXO) that do not exist. Even here there is a defense: the light client can ask for a second Merkle tree proof showing that the funds being spent by the 10 BTC transaction also exist, and so on down to some safe block depth. From the point of view of a miner using a light client, this becomes a challenge-response protocol: full nodes verifying transactions that detect a transaction spending a nonexistent output can publish a “challenge” to the network, and other nodes (likely the miner of that block) would need to publish a “response” consisting of a Merkle tree proof showing that the outputs in question do exist in some previous block. However, this protocol has one weakness in Bitcoin: transaction fees. A malicious miner can publish a block giving themselves a 1000 BTC reward, and other miners running light clients would have no way of knowing that the block is invalid without adding up the fees from every transaction themselves; for all they know, someone else could have been crazy enough to actually add 975 BTC worth of fees.
BP2
With the earlier Block Protocol 1.0, Ethereum faced even greater challenges; there was no method for a light client to verify that the state tree of a block was a valid outcome of the parent state and the transaction list. In fact, the only way to obtain any assurances was for a node to individually process every transaction and sequentially apply them to the parent state. BP2, however, introduces stronger assurances. With BP2, each block now comprises three trees: a state tree, a transaction tree, and a stack trace tree providing the intermediate root of both the state tree and the transaction tree after each step. This facilitates a challenge-response protocol that, in simplified terms, operates as follows:
1. Miner M publishes block B. If the miner is malicious, the block may update the state incorrectly at some point.
2. Light node L receives block B and performs basic proof of work and structural validity checks on the header. If these checks pass, L starts treating the block as legitimate, though unconfirmed.
3. Full node F receives block B and starts a thorough verification process, applying each transaction to the parent state and making sure that each intermediate state matches the intermediate state provided by the miner. Suppose that F finds an inconsistency at point k. Then F broadcasts a “challenge” to the network consisting of the hash of B and the value k.
4. L receives the challenge and temporarily flags B as untrustworthy.
5. If F’s claim is false, and the block is valid at that point, then M can produce a proof of localized consistency by showing a Merkle tree proof of point k in the stack trace, point k+1 in the stack trace, and the subset of Merkle tree nodes in the state and transaction tree that were modified during the update from k to k+1. L can then verify the proof by taking M’s word on the validity of the block up to point k, manually running the update from k to k+1 (this consists of processing a single transaction), and checking that the resulting root hashes match what M provided. L would, of course, also check that the Merkle tree proofs for the values at points k and k+1 are valid.
6. If F’s claim is true, then M would not be able to come up with a response, and after some period of time L would discard B outright.
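The exchange above can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions, not the BP2 wire format: the state is a plain dictionary of balances, `root_of` hashes the whole dictionary instead of building a Patricia tree, the stack trace is a flat list of roots instead of a tree, and M's response hands over the entire pre-state rather than only the Merkle nodes touched by the update. All names are hypothetical.

```python
import hashlib
import json

def root_of(state):
    """Toy stand-in for a state tree root: hash of the sorted balances."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def apply_tx(state, tx):
    """Apply one transfer (sender, recipient, amount); raise if it is invalid."""
    sender, recipient, amount = tx
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError("invalid transaction")
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state

def build_trace(parent_state, txs):
    """Honest miner M: trace[k] is the state root after k transactions."""
    trace, state = [root_of(parent_state)], parent_state
    for tx in txs:
        state = apply_tx(state, tx)
        trace.append(root_of(state))
    return trace

def find_inconsistency(parent_state, txs, trace):
    """Full node F: return the first k whose k -> k+1 update is wrong, else None."""
    state = parent_state
    for k, tx in enumerate(txs):
        try:
            state = apply_tx(state, tx)
        except ValueError:
            return k
        if root_of(state) != trace[k + 1]:
            return k
    return None

def check_response(root_k, root_k1, tx, pre_state):
    """Light node L: trust M up to point k, then redo only the k -> k+1 step."""
    if root_of(pre_state) != root_k:
        return False
    try:
        return root_of(apply_tx(pre_state, tx)) == root_k1
    except ValueError:
        return False
```

If F challenges an honest block, M answers with the pre-state at point k and `check_response` returns `True`, so L restores its trust in B. If the miner tampered with the trace at point k+1, no pre-state that matches the root at k can also produce the claimed root at k+1, so every candidate response fails and L eventually discards B.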
Note that the model currently calls for transaction fees to be burned rather than distributed to miners, so the weakness in Bitcoin’s light client protocol does not apply. However, even if we decided to change this, the protocol can easily be adapted to handle it; the stack trace would simply keep a running counter of transaction fees alongside the state and transaction list. As an anti-spam measure, for F’s challenge to be valid, F must have either mined one of the last 10000 blocks or held 0.01% of the total ether supply for some period of time. If a full node sends a false challenge, meaning that a miner successfully responds to it, light nodes can blacklist the node’s public key.
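As a rough illustration of the anti-spam rule, a light node's eligibility check might look like the Python sketch below. The data shapes are assumptions made for the example: `recent_miners` is a list of the public keys that mined recent blocks (newest last), `balances` maps public keys to ether holdings, and the 0.01% threshold is read as "at least 0.01%". The required holding period is not specified in the text, so it is deliberately left out.

```python
RECENT_BLOCKS = 10000   # a challenger must have mined one of the last 10000 blocks...
STAKE_DIVISOR = 10000   # ...or hold at least 1/10000 (0.01%) of the total supply

def may_challenge(pubkey, recent_miners, balances, total_supply, blacklist):
    """Return True if a challenge signed by pubkey should be taken seriously."""
    if pubkey in blacklist:
        return False  # this key has already sent a challenge that was refuted
    if pubkey in recent_miners[-RECENT_BLOCKS:]:
        return True
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return balances.get(pubkey, 0) * STAKE_DIVISOR >= total_supply

def on_refuted_challenge(pubkey, blacklist):
    """Called when a miner successfully answers a challenge from pubkey."""
    blacklist.add(pubkey)
```

Keeping the blacklist as a set of public keys makes a false challenge costly: the challenger burns an identity that took either mining work or a sizeable ether balance to establish.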
Altogether, this means that, unlike Bitcoin, Ethereum will likely remain fully secure, including against fraudulent issuance attacks, even if only a small number of full nodes exist; as long as at least one full node is honest, verifying blocks and publishing challenges where appropriate, light clients can rely on it to point out which blocks are flawed. Note that this protocol has one weakness: you now need to know all transactions before you start processing a block, and adding new transactions requires substantial effort to recalculate intermediate stack trace values, so the process of producing a block will be less efficient. However, it is likely possible to patch the protocol to get around this, and if so, BP2.1 will include such a fix.
Blockchain-based Mining
While we have not finalized the specifics of this, Ethereum will likely utilize something akin to the following for its mining algorithm:
1. Define H[i] = sha3(sha3(block header without nonce) ++ nonce ++ i) for i in [0 … 15].
2. Define N as the number of transactions in the block.
3. Define T[i] as the (H[i] mod N)th transaction in the block.
4. Define S as the state of the parent block.
5. Apply T[0] … T[15] to S, and call the resulting state S’.
6. Define x = sha3(S’.root).
7. The block is valid if x * difficulty <= 2^256.
This possesses the following characteristics:
1. It is extremely memory-hard, even more so than Dagger, since mining effectively requires access to the entire blockchain. However, it can be parallelized with shared disk space, so it will likely be GPU-dominated rather than CPU-dominated as Dagger was originally intended to be.
2. It is memory-easy to verify, since a proof of validity consists only of the relatively small subset of Patricia nodes used while processing T[0] … T[15].
3. All miners essentially have to be full nodes; asking the network for block data for every nonce is prohibitively slow. Consequently, there will be more full nodes in Ethereum than in Bitcoin.
4. As a result of (3), one of the main motivations for using centralized mining pools, namely that they let miners operate without downloading the entire blockchain, disappears. The other main reason to use mining pools, that they even out the payout rate, can be achieved just as easily with the decentralized p2pool (which we will likely end up supporting with development resources).
5. ASICs for this mining algorithm are simultaneously ASICs for transaction processing, so Ethereum ASICs will help solve the scalability problem.
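The mining steps listed earlier can be sketched in Python as follows. Several details here are assumptions made so the example runs, since the algorithm itself is not finalized: `hashlib.sha3_256` stands in for whichever sha3 variant is intended, the nonce and index are encoded as fixed-width big-endian integers, the validity test is taken to be the usual target check x * difficulty <= 2^256, and `toy_apply` / `toy_root` replace real transaction execution and the Patricia state root. The toy transition also treats an unaffordable transfer as a no-op, because the text does not say what happens when a sampled transaction cannot be applied to the parent state.

```python
import hashlib
import json

def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()

def toy_apply(state, tx):
    """Toy transfer (sender, recipient, amount); unaffordable transfers are no-ops."""
    sender, recipient, amount = tx
    if state.get(sender, 0) < amount:
        return state
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state

def toy_root(state) -> bytes:
    """Toy stand-in for S'.root: hash of the sorted balances."""
    return sha3(json.dumps(state, sort_keys=True).encode())

def pick_transactions(header_without_nonce: bytes, nonce: int, txs):
    """Compute H[0..15] and select T[i] = txs[H[i] mod N]."""
    seed = sha3(header_without_nonce)
    picked = []
    for i in range(16):
        h_i = sha3(seed + nonce.to_bytes(8, "big") + i.to_bytes(1, "big"))
        picked.append(txs[int.from_bytes(h_i, "big") % len(txs)])
    return picked

def pow_value(header_without_nonce, nonce, txs, parent_state):
    """Apply T[0..15] to the parent state and return x = sha3(S'.root) as an integer."""
    state = parent_state
    for tx in pick_transactions(header_without_nonce, nonce, txs):
        state = toy_apply(state, tx)
    return int.from_bytes(sha3(toy_root(state)), "big")

def is_valid(x: int, difficulty: int) -> bool:
    return x * difficulty <= 2**256

def mine(header_without_nonce, txs, parent_state, difficulty, max_nonce=100000):
    """Try nonces in order; return the first valid one, or None if none is found."""
    for nonce in range(max_nonce):
        if is_valid(pow_value(header_without_nonce, nonce, txs, parent_state), difficulty):
            return nonce
    return None
```

Every nonce attempt touches up to 16 pseudo-randomly chosen transactions and the parts of the parent state they read or write, which is why a miner needs the block data and state on hand instead of fetching them from the network for each nonce. A verifier, by contrast, only has to repeat `pow_value` once for the winning nonce.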
At this juncture, there is genuinely only one optimization that remains: discovering a method to overcome the challenge that mandates every full node to process every transaction. This presents a complex problem; a truly scalable and efficient solution will require some time to develop. Nevertheless, this represents a robust foundation and may one day become one of the critical components of a final resolution.