I began crafting a post that outlined a “roadmap” for Ethereum 1.x research and the journey toward stateless Ethereum, and came to the realization that it’s not truly a roadmap at all, at least not in the manner we’re accustomed to encountering in a product or a corporation. The 1.x team, while working towards a unified objective, is an assorted group of developers and researchers working autonomously on intricately interconnected subjects. As a result, there is no “official” roadmap to reference. However, it’s not complete anarchy! There exists an acknowledged “order of operations”; certain activities must precede others, specific solutions cannot coexist, and other tasks may be advantageous but are not crucial.
Then what might serve as a more fitting analogy for achieving stateless Ethereum, if not a roadmap? It took me a moment, but I think I’ve come up with a solid one: Stateless Ethereum is akin to the ‘full spec’ in a technology tree.
Some readers may readily comprehend this comparison. If you “understand,” feel free to bypass the next few paragraphs. However, if you’re not like me and don’t typically view the world through the lens of video games: A technology tree is a familiar mechanic in gaming that permits players to unlock and enhance new abilities, technologies, or spells that are arranged in a loose hierarchy or tree format.
Generally, there exists some form of XP (experience points) that can be “spent” to obtain elements in the tree (‘spec’), which subsequently unlocks more advanced components. Occasionally, you may need to gather two unrelated basic elements to access a third, more sophisticated one; sometimes unlocking one basic ability opens up multiple new options for the next enhancement. A significant part of the fun as a player lies in selecting the correct route in the technology tree that corresponds with your skills, objectives, and preferences (do you pursue full spec in Warrior, Thief, or Mage?).
This is, surprisingly accurately, what we encounter in the 1.x research environment: A loose hierarchy of technical themes to explore, with limited time/expertise available for research, implementation, and testing. Just like in a well-crafted RPG, experience points are limited: there’s only so much that a small group of capable and motivated individuals can achieve within a year or two. Depending on delivery requirements, it might be prudent to postpone more ambitious or abstract enhancements in favor of a more direct route to the final spec. Everyone shares the same ultimate goal, but the pathway chosen to reach it will depend on which solutions are thoroughly researched and utilized.
Alright, so I will present my rough sketch of the tree, discuss how it’s organized, and then briefly explain each enhancement and its relation to the whole. The final “full-spec” enhancement in the technology tree is “Stateless Ethereum”. In other words, a fully operational Ethereum mainnet that accommodates full-state, partial-state, and zero-state nodes; that effectively and reliably disseminates witnesses and state information; and that is fundamentally prepared to continue scaling until the connection to Eth2.0 is constructed and ready to integrate the legacy chain.
Note: As mentioned earlier, this is not an ‘official’ work scheme. It’s my best attempt at compiling and organizing the key features, milestones, and decisions that the 1.x working group must reach to make Stateless Ethereum a reality. Feedback is encouraged, and updated/revised versions of this plan will definitely arise as research progresses.
The diagram should be interpreted from left to right: elements in purple depicted on the left side are ‘foundational’ and must be developed or resolved prior to subsequent enhancements to the right. Elements with a greenish tint are colored to signify that they are in some manner “bonus” items — desirable though not strictly vital for progression, and perhaps less explicitly understood in the realm of research. The larger pink shapes symbolize crucial milestones for Stateless Ethereum. All four primary milestones must be “unlocked” before a full transition to Stateless Ethereum can be initiated.
The Witness Format
There has been extensive discussion regarding witnesses in the context of stateless Ethereum, so it should not be surprising that the first significant milestone I’ll highlight is a finalized witness format. This entails determining with some assurance the structure of the state trie and associated witnesses. The establishment of a specification or reference implementation could be perceived as the juncture at which ETH 1.x research “levels up”; coalescing around a new representation of state will assist in defining and concentrating the work needed to reach other milestones.
Binary Trie (or “trie, trie again”)
Transitioning Ethereum’s state to a Binary Trie structure is pivotal for achieving small enough witness sizes to be circulated around the network without encountering bandwidth/latency hurdles. As noted in the last research call, reaching a Binary Trie will necessitate a commitment to one of two mutually exclusive strategies:
- Progressive. Like the Ship of Theseus, the current hexary state trie would be transformed piece by piece over an extended timeframe. Any transaction or EVM execution that touches part of the state would, under this strategy, automatically encode its changes to state in the new binary form. This implies adopting a ‘hybrid’ trie structure that retains inactive portions of state in their existing hexary representation. The process would effectively be ongoing, and complex for client developers to implement, but would mostly shield users and higher-layer developers from the alterations occurring beneath the surface in layer 0.
- Clean-cut. Perhaps more aligned with the significance of the fundamental trie change, a clean-cut transition strategy would establish an explicit timeline of transition across multiple hard forks, compute a fresh binary trie representation of the state at that moment, and then proceed in binary form once the new state has been determined. While more straightforward from an implementation standpoint, a clean cut requires coordination from all node operators and would almost certainly involve some (limited) disruption to the network, affecting developer and user experience during the shift. On the other hand, the procedure may yield some valuable lessons for orchestrating the further-off transition to Eth2.
Irrespective of the transition approach adopted, a binary trie serves as the foundation for the witness configuration, i.e., the sequence and hierarchy of hashes constituting the state trie. Absent additional optimization, rough estimates (January 2020) suggest witness sizes range between ~300-1,400 kB, reduced from ~800-3,400 kB in the hexary trie structure.
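To get a feel for why the binary form helps, here is a back-of-the-envelope sketch. It assumes an idealized, perfectly balanced trie and counts only the sibling hashes needed to prove a single leaf; the function names are mine, and real witnesses (which also carry keys, values, and code, and share nodes across many leaves) will not match these numbers. The measured figures quoted above do not come from this calculation.

```python
HASH_BYTES = 32  # size of one node hash


def trie_depth(num_leaves: int, arity: int) -> int:
    """Levels needed for a balanced trie of the given arity to hold num_leaves."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= arity
        depth += 1
    return depth


def proof_hash_bytes(num_leaves: int, arity: int) -> int:
    """Bytes of sibling hashes needed to prove one leaf (idealized model).

    At every level the prover must supply the (arity - 1) siblings of the
    node on the path, so wider nodes cost more per level.
    """
    return trie_depth(num_leaves, arity) * (arity - 1) * HASH_BYTES


leaves = 16 ** 6  # about 16.7 million leaves, chosen only for round numbers
hexary = proof_hash_bytes(leaves, 16)  # 6 levels * 15 siblings * 32 bytes
binary = proof_hash_bytes(leaves, 2)   # 24 levels * 1 sibling * 32 bytes
print(hexary, binary)  # prints: 2880 768
```

The binary trie is four times deeper, but each level needs one sibling hash instead of fifteen, which is where the saving comes from in this simplified model.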
Code Division (merkleization)
A significant element of a witness is the accompanying code. Without code division, a transaction that included a contract invocation would necessitate the complete bytecode of that contract to confirm its codeHash. That could represent a substantial amount of information, contingent on the contract. Code ‘merkleization’ is a technique that fragments contract bytecode such that only the segment of the code invoked is needed to create and verify a witness for the transaction. This represents one method for substantially diminishing the typical size of witnesses. There are two approaches to fragmenting contract code, and currently, it is unclear whether the two are mutually exclusive.
- “Static” division. Segmenting contract code into fixed sizes around 32 bytes. For the merkleized code to function properly, static segments must also consist of additional meta-data accompanying each fragment.
- “Dynamic” division. Partitioning contract code into segments based on the content of the code itself, cutting at specific instructions (JUMPDEST) contained therein.
At a cursory glance, the “static” approach to code division appears preferable to evade leaky abstractions, i.e., to stop the content of the merkleized code from impacting the lower-level division, as might occur in the “dynamic” scenario. Nonetheless, both alternatives have yet to be extensively evaluated, thus both remain under consideration.
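The “static” option can be sketched in a few lines. This is an illustration under loud assumptions, not the proposed format: SHA-256 stands in for whatever hash function would actually be used, the chunk index is a placeholder for the per-chunk metadata mentioned above (whose real contents are not specified here), and all function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 32  # fixed chunk size, as in the "static" approach


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 is only a stand-in hash for this sketch.
    return hashlib.sha256(data).digest()


def chunk_code(code: bytes) -> list:
    """Split bytecode into fixed-size chunks (the last one may be shorter)."""
    return [code[i:i + CHUNK_SIZE] for i in range(0, len(code), CHUNK_SIZE)]


def leaf_hash(index: int, chunk: bytes) -> bytes:
    # The index plays the role of per-chunk metadata (hypothetical choice).
    return h(index.to_bytes(4, "big") + chunk)


def merkle_levels(leaves: list) -> list:
    """Build all levels of a binary Merkle tree; odd levels duplicate the last node."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels: list, index: int) -> list:
    """Collect the sibling hashes on the path from one leaf to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify(root: bytes, index: int, chunk: bytes, proof: list) -> bool:
    """Recompute the root from a single chunk plus its sibling hashes."""
    node = leaf_hash(index, chunk)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


code = bytes(range(256)) * 2 + b"\x00" * 5  # 517 bytes of dummy "bytecode"
chunks = chunk_code(code)
levels = merkle_levels([leaf_hash(i, c) for i, c in enumerate(chunks)])
root = levels[-1][0]
proof = prove(levels, 5)
print(len(chunks), len(proof), verify(root, 5, chunks[5], proof))  # prints: 17 5 True
```

In this toy case, a witness touching only chunk 5 carries one 32-byte chunk and five 32-byte hashes rather than all 517 bytes of code.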
ZK witness compression
Approximately 70% of a witness consists of hashes. It might be feasible to employ a ZK-STARK proving technique to compress and authenticate those intermediate hashes. As with many zero-knowledge developments today, precisely how this would function, or even whether it would work at all, remains unclear and is not easily answered. Thus, this is somewhat of a side endeavor, or a non-essential enhancement to the primary technological advancement path.
EVM Semantics
We have briefly discussed the avoidance of “leaky abstraction”, which is particularly pertinent to this milestone, so I will take a short detour here to elucidate why this idea is significant. The EVM is an abstracted component within the broader Ethereum protocol. Theoretically, details regarding the internal workings of the EVM should not influence the behavior of the larger system, and modifications to the system external to the abstraction should have no impact on anything contained within it.
In practice, however, certain features of the protocol directly affect elements within the EVM. These are clearly reflected in gas costs. A smart contract (inside the EVM abstraction) can observe, among other things, the gas costs of various stack operations (outside the EVM abstraction) through the GAS opcode. A change in gas scheduling can directly affect the behavior of specific contracts, but this depends on the context and on how the contract uses the information to which it has access.
Due to the ‘leaks’, revisions to gas scheduling and EVM execution must be approached cautiously, as they might inadvertently affect smart contracts. This is a reality that must be addressed; it is quite challenging to engineer systems with zero abstraction leakage, and in any case, the 1.x researchers do not have the luxury of reengineering anything from scratch: they must operate within the current Ethereum protocol, which is somewhat prone to leakage in the virtual state machine abstraction.
Returning to the primary subject: The implementation of witnesses will necessitate adjustments in gas scheduling. Witnesses must be produced and communicated across the network, and that process needs to be factored into EVM operations. The subjects related to this milestone pertain to what those costs and incentives may be, how they are assessed, and how they will be realized with minimal effects on higher layers.
Witness Indexing / Gas accounting
There is likely much more complexity to this section than can feasibly fit within a handful of sentences; I’m certain we’ll explore it more deeply on a future occasion. For now, comprehend that every transaction will bear responsibility for a minor part of the complete block’s witness. Generating a block’s witness entails certain computations that will be executed by the block’s miner, and thus will need to incur an associated gas cost, paid for by the transaction’s initiator.
As multiple transactions may touch the same part of the state, it is unclear how best to estimate the gas cost of witness generation at the moment a transaction is broadcast. If transaction owners cover the entire cost of witness production, scenarios may arise where the same section of a block witness is paid for multiple times by ‘overlapping’ transactions. While this isn’t inherently negative, it introduces real changes to gas incentives that need to be better understood.
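The overlap problem is easy to see with a toy model. Everything here is hypothetical: the flat per-item cost, the item labels, and the function names are assumptions for illustration, not a proposed gas schedule.

```python
def charged_to_senders(tx_accesses: dict, cost_per_item: int) -> int:
    """Total gas charged if every transaction pays for every state item it touches."""
    return sum(len(items) * cost_per_item for items in tx_accesses.values())


def block_witness_cost(tx_accesses: dict, cost_per_item: int) -> int:
    """Cost of the block witness itself, where each state item appears only once."""
    unique_items = set().union(*tx_accesses.values())
    return len(unique_items) * cost_per_item


# Three transactions in one block; all of them touch state item "B".
accesses = {
    "tx1": {"A", "B"},
    "tx2": {"B", "C"},
    "tx3": {"B"},
}
COST = 100  # hypothetical gas per witnessed state item

print(charged_to_senders(accesses, COST))  # prints: 500
print(block_witness_cost(accesses, COST))  # prints: 300
```

The 200-gas gap is item "B" being paid for three times while appearing in the block witness once.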
Regardless of the associated gas costs, the witnesses themselves will need to be integrated into the Ethereum protocol, and will likely require incorporation as a standard aspect of each block, possibly including something as straightforward as a witnessHash included in each block header.
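As a rough picture of what a header commitment could look like, the sketch below lets a node check a received witness against a hash stored in the block header. The `BlockHeader` fields, the `witness_hash` name, and the use of SHA-256 are all assumptions made for illustration; no such header field is specified yet.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockHeader:
    # Hypothetical, heavily simplified header; real headers have many more fields.
    number: int
    state_root: bytes
    witness_hash: bytes  # assumed commitment to the serialized block witness


def commit(witness_bytes: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the protocol's actual hash function.
    return hashlib.sha256(witness_bytes).digest()


def witness_matches_header(header: BlockHeader, witness_bytes: bytes) -> bool:
    """A stateless node would reject a witness that does not match the header."""
    return commit(witness_bytes) == header.witness_hash


witness = b"serialized-witness-placeholder"
header = BlockHeader(number=1, state_root=b"\x00" * 32, witness_hash=commit(witness))
print(witness_matches_header(header, witness))            # prints: True
print(witness_matches_header(header, witness + b"junk"))  # prints: False
```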
UNGAS / Versionless Ethereum
This category of upgrades is primarily independent of Stateless Ethereum and pertains to gas costs in the EVM, as well as mitigating those abstraction leaks I previously mentioned. UNGAS is an abbreviation for “unobservable gas”, and it is a modification that would explicitly forbid contracts from using the GAS opcode, to prevent smart contract developers from making any assumptions about gas costs. UNGAS is part of a series of proposals from the Ethereum core paper to patch up some of those leaks, making all future changes to gas scheduling simpler to carry out, particularly changes related to witnesses and Stateless Ethereum.
State Accessibility
Stateless Ethereum will not eliminate state altogether. Instead, it will render state optional, affording clients a measure of flexibility concerning how much state they monitor and process independently. Consequently, the entire state must be made reachable somewhere, enabling nodes that wish to download all or portions of the state to do so.
In a sense, current paradigms like fast sync already facilitate this functionality. However, the advent of zero-state and partial-state nodes complicates the situation for new nodes striving to synchronize. At present, a new node can expect to acquire the state from any healthy peers it connects with because every node maintains a copy of the current state. This presumption, however, becomes uncertain if some of the peers are potentially zero-state or partial-state nodes.
The pre-conditions for this milestone pertain to the methods by which nodes communicate to one another what portions of state they possess, as well as the techniques for reliably transmitting these portions across a perpetually changing peer-to-peer network.
Network Dissemination Rules
The diagram below illustrates a theoretical network topology that could exist within stateless Ethereum. In such a network, nodes must be capable of positioning themselves according to which portions of state they wish to retain, if any.
Enhancements such as EIP #2465 fall within the broad domain of network dissemination rules: New message types within the network protocol that offer additional information concerning what details nodes possess, and outline how that information should be transmitted to other nodes in potentially awkward or constrained network topologies.
Data Transmission Model / DHT Routing
If advancements like the aforementioned message types gain acceptance and are executed, nodes will be equipped to discern what portions of state are held by connected peers. What occurs if none of the connected peers possess a required segment of state?
Data transmission presents a somewhat open-ended dilemma with numerous potential resolutions. We might consider turning to more ‘mainstream’ solutions, making some or all of the state accessible via HTTP requests from a cloud server. A more ambitious alternative might involve adopting characteristics from related peer-to-peer data transmission frameworks, enabling requests for state segments to be proxied through connected peers, which find the proper destination via a Distributed Hash Table. These two extremes are not inherently incompatible; why not both?
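For the DHT option, one common routing idea is to forward a request to the peers whose IDs are “closest” to the key being requested. The sketch below uses XOR distance in the style of Kademlia; that choice is my assumption, since no particular DHT design is named here, and the tiny integer IDs are purely for readability.

```python
def closest_peers(target_key: int, peer_ids: list, k: int = 2) -> list:
    """Return the k peer IDs with the smallest XOR distance to target_key.

    Assumption: peers and state segments share one ID space, and a peer is
    expected to hold (or know who holds) the segments nearest its own ID.
    """
    return sorted(peer_ids, key=lambda pid: pid ^ target_key)[:k]


peers = [0b1000, 0b0001, 0b1011, 0b0110]
wanted = 0b1010  # ID of the state segment we are looking for

# XOR distances to 0b1010: 0b1000 -> 2, 0b0001 -> 11, 0b1011 -> 1, 0b0110 -> 12
print(closest_peers(wanted, peers))  # prints: [11, 8]
```

A node that holds none of the requested state could forward the request to these peers, and fall back to an HTTP source if the lookup comes up empty.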
State Partitioning
One strategy for enhancing state distribution is to divide the complete state into more manageable segments (tiles), stored within a networked cache capable of supplying state to nodes in the network, thereby reducing the load on the full nodes providing state. The premise is that even with relatively large tile sizes, it is probable that some of the tiles would remain unchanged from block to block.
The geth team has conducted experiments that indicate state partitioning is viable for augmenting state snapshot availability.
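To make the “tiles stay unchanged” premise concrete, here is a minimal sketch. It assumes tiles are defined by the leading bits of each state key and fingerprinted with SHA-256; how the geth experiments actually define tiles is not described here, so treat the scheme and all names as hypothetical.

```python
import hashlib


def tile_of(key: bytes, prefix_bits: int = 4) -> int:
    """Assign a key to a tile by its leading bits (assumed partitioning scheme)."""
    return key[0] >> (8 - prefix_bits)


def tile_hashes(state: dict, prefix_bits: int = 4) -> dict:
    """Hash the sorted key/value pairs of each tile into one fingerprint per tile."""
    hashers = {}
    for key in sorted(state):
        tile = tile_of(key, prefix_bits)
        if tile not in hashers:
            hashers[tile] = hashlib.sha256()
        value = state[key]
        # Length prefixes keep adjacent keys and values from running together.
        hashers[tile].update(len(key).to_bytes(2, "big") + key)
        hashers[tile].update(len(value).to_bytes(4, "big") + value)
    return {tile: hasher.digest() for tile, hasher in hashers.items()}


def unchanged_tiles(old: dict, new: dict) -> list:
    """Tiles whose fingerprint is identical in both blocks, so a cache can reuse them."""
    return sorted(t for t in old if new.get(t) == old[t])


block_n = {b"\x00a": b"1", b"\x10b": b"2", b"\xf0c": b"3"}
block_n1 = dict(block_n)
block_n1[b"\x10b"] = b"20"  # only one account changes in the next block

print(unchanged_tiles(tile_hashes(block_n), tile_hashes(block_n1)))  # prints: [0, 15]
```

Only tile 1 needs to be refreshed in the cache; tiles 0 and 15 can be served as they are.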
Chain Trimming
Much has been elaborated on chain trimming already, so a more elaborate explanation is not essential. It is crucial to explicitly mention, however, that full nodes can prudently prune historical data such as transaction receipts, logs, and historical blocks only if historical state snapshots can be made readily available to new full nodes, through mechanisms like state partitioning and/or a DHT routing scheme.
Network Protocol Specification
Finally, the complete landscape of Stateless Ethereum is becoming clear. The three milestones of Witness Format, EVM Semantics, and State Accessibility together enable a comprehensive Network Protocol Specification: the clearly defined updates that should be coded into every client implementation, and deployed during the next hard fork to transition the network into a stateless framework.
We’ve traversed substantial territory in this article, yet there remain a few untidy details from the diagram that merit further clarification:
Formal Stateless Specification
Ultimately, it is not a necessity for the complete stateless protocol to be officially delineated. It is conceivable that a reference implementation be developed and utilized as the foundation for all clients to replicate. Nonetheless, there are undeniable advantages to formulating a “formalized” specification for witnesses and stateless clients. This would essentially function as an extension or appendix that would fit within the Ethereum Yellow Paper, articulating with precision the expected behavior of an Ethereum stateless client implementation.
Beam Sync, Red Queen’s Sync, and Additional State Sync Optimizations
Synchronization strategies are not principal to the network protocol, but rather are implementation specifics that influence how efficiently nodes perform the protocol. Beam sync and Red Queen’s sync are interrelated strategies for accumulating a local copy of state from witnesses. Some effort should be directed toward enhancing these strategies and adjusting them for the final ‘version’ of the network protocol, when that is established and executed.
For the time being, they are left as ‘bonus’ elements in the tech tree, as they can be developed independently of other matters, and because details of their execution hinge on more fundamental decisions like witness format. It’s notable that these extra-protocol subjects are, by virtue of their autonomy from ‘core’ alterations, a suitable avenue for implementing and testing the more foundational advancements on the left side of the tree.
Conclusion
Well, that was quite an extensive journey! I trust that the topics and milestones, along with the overall concept of the “tech tree,” are beneficial in organizing the breadth of “Stateless Ethereum” research.
The structure of this tree is something I hope to keep current as developments unfold. As previously mentioned, it’s not an ‘official’ or ‘final’ scope of work; it’s merely the most precise outline we have at the present time. Please feel free to reach out if you have suggestions on enhancements or modifications.
As always, if you have questions, requests for new topics, or wish to take part in Stateless Ethereum research, don’t hesitate to introduce yourself on ethresear.ch, and/or contact @gichiba or @JHancock on Twitter.