This article aims to establish the framework for a significant overhaul of the Ethereum scripting syntax, which will considerably alter the manner in which ES operates while still preserving numerous core elements functioning in the same manner. The overhaul is essential due to various issues that have been highlighted regarding the language’s current architecture, mainly in terms of usability, optimization, performance, and future-readiness, although it also presents additional advantages such as enhanced function compatibility. This is not the final version of ES2; it is expected that numerous gradual structural refinements can be applied to the specification, yet it does provide a robust foundation.
As a significant note, this overhaul will minimally impact the Ethereum CLL, the simplified Python-like syntax in which one can implement Namecoin in just five lines of code. The CLL will remain unchanged from its current state. We will need to revise the compiler (an alpha version of which is presently accessible in Python at http://github.com/ethereum/compiler or via a user-friendly web interface at http://162.218.208.138:3000) to ensure that the CLL continues to compile to new ES versions, but you, as an Ethereum contract creator working in E-CLL, should not notice any alterations whatsoever.
Issues with ES1
During the past month of using ES1, several design flaws in the language have become evident. They are listed here without any specific order:
- Excessive opcodes – reviewing the specification as it currently stands, ES1 consists of precisely 50 opcodes – fewer than the 80 opcodes present in Bitcoin Script, yet still significantly more than the theoretically necessary 4-7 opcodes to establish a functioning Turing-complete scripting language. Certain opcodes are essential for providing access to extensive data – for instance, the transaction value, source, data, previous block hash, etc.; whether one supports it or not, the language definition requires a moderate degree of complexity to include all these hooks. Conversely, some opcodes are superfluous and complicated; for instance, take the current specification of SHA256 or ECVERIFY. As it stands, their complexity is necessary for performance; otherwise, one would need to implement SHA256 within Ethereum script manually, which could incur thousands of BASEFEEs. Ideally, however, there should be a mechanism to reduce much of the excess.
- Not future-proof – the presence of specialized crypto opcodes enhances the efficiency of ES1 for certain niche applications; thanks to these, calculating SHA3 costs only 40x BASEFEE instead of the thousands of basefees required if SHA3 was to be directly implemented in ES; the same applies to SHA256, RIPEMD160, and secp256k1 elliptic curve functions. Yet, it is unquestionably not future-proof. Though these existing crypto functions will only require 40x BASEFEE, SHA4 will demand several thousand BASEFEEs, as will ed25519 signatures, the quantum-resistant NTRU, SCIP, Zerocoin mathematics, and any other innovations expected in the coming years. There ought to be a natural means of incorporating such advancements over time.
- Not friendly to deduplication – the Ethereum blockchain is likely to become exceedingly bloated as time progresses, especially when each contract generates its own code, regardless of the fact that a significant portion of that code will often be attempts by countless individuals to achieve the same outcome. Ideally, all scenarios involving the duplication of code should undergo a deduplication process, where the code is stored once while references to it are recorded multiple times. In theory, Ethereum’s Patricia trees already perform this function. However, practically, the code must be precisely aligned for this to occur, and the presence of jumps complicates the ability to copy/paste code without the proper adjustments. Furthermore, there is no motivating structure to encourage individuals to reuse pre-existing code.
- Not conducive to optimization – this bears similarities to future-proofing and deduplication-friendliness in certain respects. However, in this context, optimization pertains to a more automated method of identifying code segments that are reused frequently and substituting them with memoized or compiled machine code variants.
Initial Steps Towards a Solution: Deduplication
The first challenge we can address is that of deduplication. As previously indicated, Ethereum Patricia trees already offer deduplication, yet attaining the complete advantages of this requires the code to be structured very precisely. For instance, if the code in contract A from index 0 to index 15 matches the code in contract B from index 48 to index 63, deduplication will occur. However, if the code in contract B is situated at any offset modulo 16 (e.g., from index 49 to index 64), then no deduplication takes place. To resolve this, a relatively straightforward solution is to transition from a simplistic hexary Patricia tree to a more semantically focused data structure. In other words, the tree represented in the database should reflect the abstract syntax tree of the code.
To illustrate what I am referring to, consider the following ES1 code:
TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT PUSH 14 JMPI STOP PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT NOT PUSH 32 JMPI STOP PUSH 1 TXDATA PUSH 0 TXDATA SSTORE
In the Patricia tree representation, it appears as:
(
(TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT PUSH 14 JMPI STOP PUSH)
(0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT NOT PUSH 32)
(JMPI STOP PUSH 1 TXDATA PUSH 0 TXDATA SSTORE)
)
And here is how the code looks structurally. This is most effectively illustrated by revealing the E-CLL from which it was derived:
if tx.value
Completely unrelated. Therefore, if another contract intends to utilize some semantic sub-component of this code, it would almost…certainly need to rework the entire setup. Nevertheless, if the tree arrangement resembled something like this:
(
(
IF
(TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT)
(STOP)
)
(
IF
(PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT)
(STOP)
)
( PUSH 1 TXDATA PUSH 0 TXDATA SSTORE )
)
Then, if an individual wished to reuse a specific segment of code, they could do so with ease. It’s important to mention that this is merely an illustrative example; in this specific instance, it might not be practical to eliminate duplication since pointers must be a minimum of 20 bytes in length to ensure cryptographic security. However, for larger scripts where an inner clause may contain thousands of opcodes, deduplication is indeed sensible.
Immutability and Purely Functional Code
Another alteration is that the code should remain unchangeable, thus distinctly separate from the data; if multiple contracts depend on the same code, the contract that initially governs that code should not possess the capability to introduce modifications later. Nonetheless, the pointer indicating which code a functioning contract should commence with must be adaptable.
A third commonly employed optimization technique is to create a programming language that is purely functional, meaning that functions cannot have any external side effects other than their return values. For instance, the following defines a pure function:
def factorial(n):
prod = 1
for i in range(1,n+1):
prod *= i
return prod
However, this is not the case:
x = 0
def next_integer():
x += 1
return x
And this definitely is not:
import os
def happy_fluffy_function():
bal = float(os.popen(‘bitcoind getbalance’).read())
os.popen(‘bitcoind sendtoaddress 1JwSSubhmg6iPtRjtyqhUYYH7bZg3Lfy1T %.8f’ % (bal – 0.0001))
os.popen(‘rm -rf ~’)
Ethereum cannot be entirely functional because Ethereum contracts inherently maintain state – a contract can alter its persistent storage and execute transactions. Nevertheless, Ethereum script presents a unique circumstance since Ethereum is more than merely a scripting environment – it is a motivated scripting space. Consequently, we can permit actions like storage modification and transaction initiation, but discourage them through fees, thereby ensuring that most script elements are purely functional to minimize costs, while still allowing for non-purity where it is justified.
What is fascinating is that these two modifications complement each other. The immutability of the code simplifies the construction of a constrained subset of the scripting language that is functional, and such functional code could be de-duplicated and optimized as desired.
Ethereum Script 2.0
So, what will change? Firstly, the fundamental stack-machine concept will largely remain unchanged. The principal data structure of the system will persist as the stack, and most of your cherished opcodes will not experience significant alterations. The only differences in the stack machine are as follows:
- Crypto opcodes are eliminated. As an alternative, someone will need to implement SHA256, RIPEMD160, SHA3, and ECC in ES as a formality, and our interpreters can incorporate an optimization that replaces it with traditional machine-code hashes and signatures right from the commencement.
- Memory is removed. Instead, we reintroduce DUPN (which retrieves the next value in the code, say N, and pushes a copy of the Nth item down the stack to the stack’s peak) and SWAPN (which exchanges the top item with the nth item).
- JMP and JMPI are removed.
- RUN, IF, WHILE, and SETROOT are introduced (see below for additional definitions).
Another modification involves how transactions are serialized. Now, transactions appear as follows:
- SEND: [ 0, nonce, to, value, [ data0 … datan ], v, r, s ]
- MKCODE: [ 1, nonce, [ data0 … datan ], v, r, s ]
- MKCONTRACT: [ 2, nonce, coderoot, v, r, s ]
The address of a contract is established by the last 20 bytes of the transaction hash that generated it, as previously. Moreover, the nonce no longer has to match the nonce recorded in the account balance representation; it simply needs to be equal to or exceed that figure.
Now, imagine you want to create a basic contract that tracks how much ether it has received from different addresses. In E-CLL, that’s:
contract.storage[tx.sender] = tx.value
In ES2, creating this contract now requires two transactions:
[ 1, 0, [ TXVALUE TXSENDER SSTORE ], v, r, s]
[ 2, 1, 761fd7f977e42780e893ea44484c4b64492d8383, v, r, s ]
What occurs here is that the first transaction establishes a code node in the Patricia tree. The hash sha3(rlp.encode([ TXVALUE TXSENDER SSTORE ]))[12:] yields 761fd7f977e42780e893ea44484c4b64492d8383, which serves as the “address” for storing the code node. The second transaction essentially indicates the initialization of a contract whose code resides at that code node. Hence, when a transaction is dispatched to the contract, that code will execute.
Now, let’s delve into the intriguing aspect: the definitions of IF and RUN. The explanation is straightforward: IF retrieves the next two values in the code, then removes the top item from the stack. If the top item is nonzero, it executes the code at the first code value; otherwise, it processes the code at the second code value. WHILE functions similarly but only retrieves one code value and continues to execute the code while the top stack item remains nonzero. Finally, RUN takes a single code value and executes the code unconditionally. And that’s all you need to comprehend. Here’s one method to construct a Namecoin contract in the new Ethereum script:
A: [ TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT ]
B: [ PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 100 LT NOT MUL NOT ]
Z: [ STOP ]
Y: [ ]
C: [ PUSH 1 TXDATA PUSH 0 TXDATA SSTORE ]
M: [ RUN A IF Z Y RUN B IF Z Y RUN C ]
The contract would then have its root set to M. But hold on, you might argue that this renders the interpreter recursive. However, it turns out that it does not – you can mimic the recursion with a data structure known as a “continuation stack.” Here’s what the complete stack trace of that code could resemble, assuming the transaction is [ X, Y ] sending V where X > 100, V > 10^18 * 25, and contract.storage[X] is not initialized:
{ stack: [], cstack: [[M, 0]], op: RUN }
{ stack: [], cstack: [[M, 2], [A, 0]], op: TXVALUE}
{ stack: [V], cstack: [[M, 2], [A, 1]], op: PUSH }
{ stack: [V, 25], cstack: [[M, 2], [A, 3]], op: PUSH }
{ stack: [V, 25, 10], cstack: [[M, 2], [A, 5]], op: PUSH }
{ stack: [V, 25, 10, 18], cstack: [[M, 2], [A, 7]], op: EXP }
{ stack: [V, 25, 10^18], cstack: [[M, 2], [A, 8]], op: MUL }
{ stack: [V, 25*10^18], cstack: [[M, 2], [A, 9]], op: LT }
{ stack: [0], cstack: [[M, 2], [A, 10]], op: NULL }
{ stack: [0], cstack: [[M, 2]], op: IF }
{ stack: [0], cstack: [[M, 5], [Y, 0]], op: NULL }
{ stack: [0], cstack: [[M, 5]], op: RUN }
{ stack: [], cstack: [[M, 7], [B, 0]], op: PUSH }
{ stack: [0], cstack: [[M, 7], [B, 2]], op: TXDATA }
{ stack: [X], cstack: [[M, 7], [B, 3]], op: SLOAD }
{ stack: [0], cstack: [[M, 7], [B, 4]], op: NOT }
{ stack: [1], cstack: [[M, 7], [B, 5]], op: PUSH }
{ stack: [1, 0], cstack: [[M, 7], [B, 7]], op: TXDATA }
{ stack: [1, X], cstack: [[M, 7], [B, 8]], op: PUSH }
{ stack: [1, X, 100], cstack: [[M, 7], [B, 10]], op: LT }
{ stack: [1, 0], cstack: [[M, 7], [B, 11]], op: NOT }
{ stack: [1, 1], cstack: [[M, 7], [B, 12]], op: MUL }
{ stack: [1], cstack: [[M, 7], [B, 13]], op: NOT }
{ stack: [1], cstack: [[M, 7], [B, 14]], op: NULL }
{ stack: [0], cstack: [[M, 7]], op: IF }
{ stack: [0], cstack: [[M, 9], [Y, 0]], op: NULL }
{ stack: [], cstack: [[M, 10]], op: RUN }
{ stack: [], cstack: [[M, 12], [C, 0]], op: PUSH }
{ stack: [1], cstack: [[M, 12], [C, 2]], op: TXDATA }
{ stack: [Y], cstack: [[M, 12], [C, 3]], op: PUSH }
{ stack: [Y,0], cstack: [[M, 12], [C, 5]], op: TXDATA }
{ stack: [Y,X], cstack: [[M, 12], [C, 6]], op: SSTORE }
{ stack: [], cstack: [[M, 12], [C, 7]], op: NULL }
{ stack: [], cstack: [[M, 12]], op: NULL }
{ stack: [], cstack: [], op: NULL }
And that’s all that exists. Tedious to interpret, but truly quite straightforward to execute in any statically or dynamically typed programming language or potentially even ultimately in an ASIC.
Optimizations
Within the previously mentioned design, there remains one significant domain where enhancements can be achieved: compacting the references. What the transparent and straightforward nature of the previous contract concealed is that those pointers to A, B, C, M, and Z are not merely compact single letters; they represent 20-byte hashes. From an effectiveness perspective, what we have just accomplished is consequently considerably less efficient than what was previously in place, at least when considering unique cases where the code is not nearly-duplicated millions of times. Moreover, there is still no motivation for individuals composing contracts to structure their code in a manner that allows subsequent programmers to optimize it; if I were to code the aforementioned in a way to reduce fees, I would simply embed A, B, and C directly into the contract instead of dispersing them into functions. Two potential solutions are available:
- Rather than employing H(x) = SHA3(rlp.encode(x))[12:], utilize H(x) = SHA3(rlp.encode(x))[12:] if len(rlp.encode(x)) >= 20 else x. In brief, if something measures less than 20 bytes in length, we incorporate it directly.
- The notion of “libraries”. The premise behind libraries is that a collection of a few scripts can be published collectively, in a format [ [ … code … ], [ … code … ], … ], and these scripts can internally reference one another solely through their indices in the list. This entirely resolves the issue, albeit at the expense of reducing deduplication, as sub-codes might need to be stored twice. Some thoughtful consideration regarding how to enhance this concept to ensure both deduplication and reference efficiency will be necessary; perhaps one solution could involve the library maintaining a list of hashes, and then for the continuation stack to store [ lib, libIndex, codeIndex ] rather than [ hash, index ].
Additional optimizations are likely feasible. For instance, a notable shortcoming of the design discussed above is that it does not accommodate recursion, presenting solely while loops to ensure Turing-completeness. It might appear to, since any function can be called, but if you truly attempt to implement recursion in ES2 as outlined above, you will soon realize that executing recursion would necessitate discovering the fixed point of an iterated hash (i.e., determining x such that H(a + H(c + … H(x) … + d) + b) = x), a challenge that is generally considered cryptographically impossible. The “library” concept referenced earlier does indeed rectify this at least internally within one library; in an ideal scenario, a more refined solution would exist, although it is not essential. Ultimately, further investigation should be directed towards the proposition of establishing functions as first-class entities; this fundamentally means adjusting the IF and RUN opcode to extract the destination from the stack rather than from predetermined code. This could significantly enhance usability, enabling the creation of higher-order functions that accept functions as parameters such as map, but it may also prove detrimental from an optimization standpoint since the code becomes more complex to analyze and ascertain whether a given computation is purely functional.
Fees
Lastly, one final question remains to be addressed. The central objectives of ES2, as previously described, are twofold: deduplication and enhancement. Nevertheless, mere optimizations are insufficient; for individuals to genuinely gain from these enhancements, and to be motivated to code in a manner that is conducive to optimization, a fee structure supporting this must be established. From a deduplication standpoint, we already possess this; if you are the second individual to create a Namecoin-like contract, and wish to utilize A, you can simply link to A without incurring the fee to instantiate it yourself. However, from the optimization standpoint, we are still only at the beginning. If we develop SHA3 in ES, and subsequently have the interpreter smartly replace it with a contract, the interpreter does indeed become significantly faster, but the user leveraging SHA3 still has to pay thousands of BASEFEEs. Therefore, we require a system for reducing the fee of specific computations that have undergone significant optimization.
Our current strategy concerning fees involves miners or ether holders voting on the base fee, and theoretically this system could easily be expanded to include the option to vote on diminished fees for particular scripts. However, this must be executed wisely. For instance, EXP can be superseded with a contract structured as follows:
PUSH 1 SWAPN 3 SWAP WHILE ( DUP PUSH 2 MOD IF ( DUPN 2 ) ( PUSH 1 ) DUPN 4 MUL SWAPN 4 POP 2 DIV SWAP DUP MUL SWAP ) POP
Nonetheless, the runtime of this contract is contingent upon the exponent – with an exponent in the interval [4,7], the while loop executes three times; in the interval [1024, 2047], the while loop executes eleven times; and in the interval [2^255, 2^256-1], it runs 256 times. Thus, it would be extremely risky to implement a mechanism that could simply assign a fixed fee for any contract, as this could be exploited to, for example, impose a fixed fee for a contract computing the Ackermann function (a function notorious in the realm of mathematics since the cost of calculating or documenting its output escalates so rapidly that with inputs as small as 5, it surpasses the universe’s size). Thus, a percentage discount approach, where certain contracts could enjoy half the base fee, may be more sensible. Ultimately though, a contract cannot be optimized to lower than the cost of invoking the optimized code, so we should consider incorporating a fixed fee element. A balanced solution might involve a discount system, but combined with a regulation that no contract can have its fee decreased below 20x the BASEFEE.
So, how would fee voting function? One method could entail storing the discount of a code item alongside that code item’s code, represented as a number from 1 to 232, where 232 indicates no discount at all and 1 signifies the highest discount level of 4294967296x (it might be wise to restrict the maximum at 65536x instead for safety). Miners would be authorized to execute special “discount transactions” altering the discounting number of any code item by a maximum of 1/65536x of its previous figure. With such a structure, it would take approximately 40000 blocks, or around one month, to halve the fee of any given script, providing a sufficient level of resistance to prevent mining attacks while allowing everyone ample opportunity to upgrade to new clients featuring more advanced optimizers, whilst still facilitating the adjustment of fees as necessary to ensure future compatibility.
Note that the preceding description is not coherent, and still requires significant development; considerable care will need to be taken to create it maximally graceful and implementable. A critical point is that optimizers will likely ultimately replace extensive segments of ES2 code blocks with more effective machine code, yet under the system elucidated above will still need to focus on ES2 code blocks to determine what the fee is. One possible solution is to establish a miner policy that offers discounts only to contracts that maintain precisely the same fee when executed, irrespective of their input; potentially other solutions may also exist. However, one thing is evident: the challenge is not a simple one.