For the last two weeks, our primary focus has been getting all of the clients updated to PoC5 compatibility, and it has definitely been a long road. Among the changes to the VM are:
- The new init/code mechanism: when you create a contract, the code provided runs immediately, and the return value of that code becomes the contract's actual code. This lets us include contract initialization code while keeping the same format, [nonce, price, gas, to, value, data], for both transactions and contract creation. It also makes it easier to create new contracts via forwarding contracts.
- Reordering transaction and contract information: the sequence is now [nonce, price, gas, to, value, data] for transactions and [gas, to, value, datain, datainsz, dataout, dataoutsz] for messages. It’s important to note that Serpent keeps the send(to, value, gas), o = msg(to, value, gas, datain, datainsz) and o = msg(to, value, gas, datain, datainsz, dataoutsz) parameters intact.
- Fee revisions: creating a transaction now incurs a fee of 500 gas, and numerous other fees have been modified.
- The CODECOPY and CALLDATACOPY opcodes: CODECOPY takes code_index, mem_index, len as inputs, and duplicates the code from code_index … code_index+len-1 to memory mem_index … mem_index+len-1. These are particularly valuable when paired with init/code. Additionally, CODESIZE has been introduced.
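The CODECOPY behaviour described in the list above can be sketched in Python as a plain byte copy. This is only an illustration of the stated semantics: the function name `codecopy`, the `bytearray` memory model, growing memory on demand, and zero-padding reads past the end of the code are all assumptions, not the clients' actual implementation.

```python
def codecopy(code: bytes, memory: bytearray,
             code_index: int, mem_index: int, length: int) -> None:
    """Copy code[code_index .. code_index+length-1] to
    memory[mem_index .. mem_index+length-1]."""
    end = mem_index + length
    # Assumption: memory grows with zero bytes when the target range is past its end.
    if len(memory) < end:
        memory.extend(b"\x00" * (end - len(memory)))
    chunk = code[code_index:code_index + length]
    # Assumption: reading past the end of the code yields zero bytes.
    chunk = chunk + b"\x00" * (length - len(chunk))
    memory[mem_index:end] = chunk


# Hypothetical 24-byte program whose last 14 bytes are the "real" contract code.
code = bytes(range(24))
memory = bytearray()
codecopy(code, memory, code_index=10, mem_index=0, length=14)
```

An initializer that copies its trailing bytes into memory and returns them is the simplest use of init/code.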
The most significant alterations, however, have pertained to the architecture surrounding the protocol. On the GUI front, the C++ and Go clients are progressing swiftly, and we can expect more updates from that arena soon. If you have been closely monitoring Ethereum, you may have spotted Denny’s Lotto, a complete implementation of a lottery, along with a GUI, crafted and executed inside the C++ client. Moving forward, the C++ client will pivot towards being more developer-centric, whereas the Go client will begin to concentrate on being a user-facing application (or more accurately, a meta-application). On the compiler front, Serpent has experienced a variety of notable enhancements.
To start, the code. You can inspect the inner workings of the Serpent compiler and will notice all the functions available, accompanied by their accurate translations into EVM code. For instance, we have:
['access', 2, 1,
['', '', 32, 'MUL', 'ADD', 'MLOAD']],
This means that what access(x,y) actually does under the hood is recursively compile whatever x and y actually are, and then load the memory at index x + y * 32; hence, x is the pointer to the start of the array and y is the index. This code structure has been around since PoC4, but I have now extended the meta-language used to describe translations even further, so that it covers even if, while, and init/code (previously they were special cases). Now only set and seq remain as special cases, and if I wanted to I could even remove seq by reimplementing it as a rewrite rule.
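As a rough illustration of what access(x,y) computes, here is a small Python model of the memory load. The function name, the flat `bytes` memory, and the big-endian 32-byte word encoding are assumptions made for the sketch; it does not reproduce the compiler's rewrite rule itself.

```python
WORD = 32  # bytes per array element


def access(memory: bytes, x: int, y: int) -> int:
    """Load the 32-byte word at address x + y * 32 (x = array start, y = index)."""
    addr = x + y * WORD
    return int.from_bytes(memory[addr:addr + WORD], "big")


# Hypothetical layout: 64 bytes of unrelated memory, then the array [7, 8, 9].
memory = bytes(64) + b"".join(v.to_bytes(WORD, "big") for v in (7, 8, 9))
second = access(memory, 64, 1)  # element at index 1
```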
The most considerable changes thus far have been related to PoC5 compatibility. For example, if you run serpent compile_to_assembly 'return(msg.data[0]*2)', you will see:
[“
The actual code there is just:
[2, 0, "CALLDATALOAD", "MUL", "MSIZE", "SWAP", "MSIZE", "MSTORE", 32, "SWAP", "RETURN"]
To see what is going on here, suppose that a message comes in with 5 as its first data item. Then we have:
2 -> Stack: [2]
0 -> Stack: [2, 0]
CALLDATALOAD -> Stack: [2, 5]
MUL -> Stack: [10]
MSIZE -> Stack: [10, 0]
SWAP -> Stack: [0, 10]
MSIZE -> Stack: [0, 10, 0]
MSTORE -> Stack: [0], Memory: [0, 0, 0 … 10]
32 -> Stack: [0, 32], Memory: [0, 0, 0 … 10]
SWAP -> Stack: [32, 0], Memory: [0, 0, 0 … 10]
RETURN
The final RETURN outputs the 32 memory bytes starting from 0, or [0, 0, 0 … 10], which represents the number 10.
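The trace above can be replayed with a tiny stack machine. The sketch below implements only the operations that appear in this one script, with operand orders inferred from the trace; the function name `run`, the byte-addressed calldata, and the big-endian word encoding are assumptions, and this is not any client's real VM.

```python
def run(script, calldata: bytes) -> bytes:
    """Execute the assembly list from the post and return the RETURN data."""
    stack, memory = [], bytearray()
    for op in script:
        if isinstance(op, int):            # literal push
            stack.append(op)
        elif op == "CALLDATALOAD":         # pop index, push 32-byte word of input
            i = stack.pop()
            word = calldata[i:i + 32].ljust(32, b"\x00")
            stack.append(int.from_bytes(word, "big"))
        elif op == "MUL":
            stack.append((stack.pop() * stack.pop()) % 2**256)
        elif op == "MSIZE":                # current memory size in bytes
            stack.append(len(memory))
        elif op == "SWAP":                 # swap the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "MSTORE":               # top = index, next = value
            index, value = stack.pop(), stack.pop()
            if len(memory) < index + 32:
                memory.extend(b"\x00" * (index + 32 - len(memory)))
            memory[index:index + 32] = value.to_bytes(32, "big")
        elif op == "RETURN":               # top = start, next = length
            start, length = stack.pop(), stack.pop()
            return bytes(memory[start:start + length])
        else:
            raise ValueError(f"unsupported op: {op}")
    return b""


script = [2, 0, "CALLDATALOAD", "MUL", "MSIZE", "SWAP",
          "MSIZE", "MSTORE", 32, "SWAP", "RETURN"]
output = run(script, (5).to_bytes(32, "big"))
result = int.from_bytes(output, "big")
```

With an input of 5, `output` is 32 bytes long and `result` is 10, matching the walkthrough.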
Next, let’s evaluate the surrounding code.
[“
I omitted the inner code discussed earlier for clarity. The first thing we see is two labels, begincode_0 and endcode_0, and the #CODE_BEGIN and #CODE_END guards. The labels mark the beginning and end of the inner code, and the guards are there for the later stages of the compiler, which treat everything between them as a separate program. Now, let's look at the first parts of the code. In this case, ~begincode_0 is at position 10 and ~endcode_0 is at position 24 in the final code.
14 -> Stack: [14]
DUP -> Stack: [14, 14]
MSIZE -> Stack: [14, 14, 0]
SWAP -> Stack: [14, 0, 14]
MSIZE -> Stack: [14, 0, 14, 0]
10 -> Stack: [14, 0, 14, 0, 10]
CALLDATACOPY -> Stack: [14, 0] Memory: [ … ]
RETURN
Notice how the first half of the code cleverly sets up the stack so that it copies the inner code into memory positions 0…13 and then returns that chunk of memory. In the final compiled code, 600e515b525b600a37f26002600035025b525b54602052f2, the inner code sits directly to the right of the initializer code that simply returns it. In more complex contracts, initializers can also do things like set specific storage slots to values, or even call or create other contracts.
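To check the layout described here, the compiled hex string can be split at byte 10 and disassembled. The opcode table below is inferred only by lining up the hex quoted in this post with the assembly listings in this post; it covers just these opcodes and should not be read as a complete or authoritative PoC5 opcode map. The helper name `disassemble` is likewise made up for the sketch.

```python
# Byte values inferred from this post's own hex and assembly; assumption, not a spec.
OPCODES = {
    0x02: "MUL", 0x35: "CALLDATALOAD", 0x37: "CALLDATACOPY",
    0x51: "DUP", 0x52: "SWAP", 0x54: "MSTORE",
    0x5b: "MSIZE", 0xf2: "RETURN",
}
PUSH = 0x60  # followed by a single literal byte in this example


def disassemble(code: bytes) -> list:
    """Turn raw bytes back into the assembly-list form used in the post."""
    out, i = [], 0
    while i < len(code):
        if code[i] == PUSH:
            out.append(code[i + 1])
            i += 2
        else:
            out.append(OPCODES[code[i]])
            i += 1
    return out


compiled = bytes.fromhex("600e515b525b600a37f26002600035025b525b54602052f2")
initializer, inner = compiled[:10], compiled[10:24]

print(disassemble(initializer))
print(disassemble(inner))
```

The initializer occupies bytes 0…9 and the 14-byte inner code occupies bytes 10…23, which is where the 10 and 14 in the trace come from.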
Now, let's introduce the latest and most fun feature of Serpent: imports. One common use case in contract land is that you want to give a contract the ability to spawn off new contracts. The problem is how to put the code for the spawned contracts into the spawner contract. Before, the only solution was the uncomfortable approach of compiling the newer contracts first and then putting the compiled code into an array. Now there is a better solution: import.
Insert the following code into returnten.se:
x = create(tx.gas - 100, 0, import(mul2.se))
return(msg(x,0,tx.gas-100,[5],1))
Next, insert this code into mul2.se:
return(msg.data[0]*2)
Now, if you compile returnten.se with serpent and run the contract, you will find that, surprise surprise, it returns ten. The reason is straightforward. The returnten.se contract creates an instance of the mul2.se contract and then calls it with the value 5. As the name suggests, mul2.se is a doubler, so it returns 5*2 = 10. Note that import is not a function in the standard sense: x = import('123.se') will fail, and import only works in the very specific context of create.
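The control flow of the two contracts can be mimicked with a toy Python model. Everything here (the `ToyChain` class, addresses as small integers, contracts as Python functions) is invented for illustration, and gas and value are ignored entirely; it only shows who creates and calls whom.

```python
class ToyChain:
    """Hypothetical stand-in for the blockchain state: address -> contract code."""

    def __init__(self):
        self.contracts = {}

    def create(self, code):
        address = len(self.contracts) + 1   # made-up address scheme
        self.contracts[address] = code
        return address

    def msg(self, to, data):
        # Run the target contract's code with the given input data.
        return self.contracts[to](self, data)


def mul2(chain, data):
    # Mirrors mul2.se: return(msg.data[0]*2)
    return data[0] * 2


def returnten(chain, data):
    # Mirrors returnten.se: create the doubler, then call it with [5].
    x = chain.create(mul2)
    return chain.msg(x, [5])


chain = ToyChain()
result = chain.msg(chain.create(returnten), [])
```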
Now, let’s say you are developing a colossal 1000-line contract and wish to partition it into files. To accomplish this, we utilize inset. In outer.se, insert:
if msg.data[0] == 1:
    inset(inner.se)
And in inner.se, input:
return(3)
When you execute serpent compile outer.se, you receive a neat piece of compiled code that returns 3 when the msg.data[0] argument is equal to one. And that’s the entire process.
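One way to picture inset is as textual inclusion at the point of use. The Python sketch below expands inset(...) lines from an in-memory dictionary of files, keeping the indentation of the line being replaced. It is a guess at the observable effect only: the function name `expand_insets` and the line-based approach are assumptions, and the real Serpent compiler may well handle inset at a different stage.

```python
import re

INSET = re.compile(r"^(\s*)inset\((.+)\)\s*$")


def expand_insets(source: str, files: dict) -> str:
    """Replace each `inset(name)` line with the contents of files[name]."""
    out = []
    for line in source.splitlines():
        m = INSET.match(line)
        if m:
            indent = m.group(1)
            name = m.group(2).strip().strip("'\"")
            # Recurse so that inset files may themselves use inset.
            for inner_line in expand_insets(files[name], files).splitlines():
                out.append(indent + inner_line)
        else:
            out.append(line)
    return "\n".join(out)


files = {
    "outer.se": "if msg.data[0] == 1:\n    inset(inner.se)",
    "inner.se": "return(3)",
}
expanded = expand_insets(files["outer.se"], files)
print(expanded)
```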
Future enhancements to Serpent will encompass:
- A refinement of this mechanism to prevent double-loading the inner code if you attempt to use import multiple times with the same filename
- String constants
- Improvements in space and code efficiency for array literals
- A debugging decorator (i.e., a compiling function which indicates which lines of Serpent correspond to which bytes of compiled code)
In the immediate future, however, my efforts will concentrate on bug fixes, a cross-client testing suite, and ongoing work on ethereumjs-lib.