In my previous post, I discussed the high-level architecture of the Endgame protocol, a non-custodial NFT rentals marketplace. Since Endgame adheres to the Default Framework specification, the protocol's business logic is separated from its contract storage. By doing so, we afford the protocol a few key benefits, such as reducing nested contract dependencies and providing granular control over which code can modify state.
However, this design choice comes with a key drawback that is not found in traditional monolithic contract designs: increased gas overhead due to an additional CALL opcode when making updates to storage.
In this post, I'll discuss a gas-saving technique I found elsewhere in the protocol: accumulating elements into an array of structs whose size isn't known up front. Let's dive in.
The Problem
During the process of creating a rental, an OrderParameters struct is received from Seaport via the Create Policy and must be persisted in contract storage so that the rental order can eventually be stopped. While the Create Policy processes this data, it must be able to determine which items in a Seaport order are considered rentals and which are considered payments.
To demonstrate, I'll first show the OrderParameters struct in a Seaport order:
struct OrderParameters {
address offerer;
address zone;
OfferItem[] offer;
ConsiderationItem[] consideration;
OrderType orderType;
uint256 startTime;
uint256 endTime;
bytes32 zoneHash;
uint256 salt;
bytes32 conduitKey;
uint256 totalOriginalConsiderationItems;
}
An issue arises here because the OfferItem array and the ConsiderationItem array must be filtered into two separate arrays: one consisting of ERC721/ERC1155 rental items and one consisting of ERC20 payment items.
If we were using almost any other language, this would be easy: a single array could be filtered into two smaller arrays. Solidity, however, does not support resizing memory arrays, so it's not possible to build up a struct array without first knowing how large it will become.
To solve this, I implemented the Accumulator Package, which can dynamically accumulate structs in memory using a custom intermediate array encoding. Then, once the complete size of the data is known, the encoding can be converted back into a standard Solidity array type.
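For context, the usual workaround in Solidity is to make two passes over the data: one pass to count how many items land in a bucket so the output array can be sized, and a second pass to copy them in. Here's a minimal sketch of that approach using simplified stand-in types (SimpleItem and Kind are illustrative only, not the protocol's code):

pragma solidity ^0.8.20;

// Illustrative stand-ins only; these are not the protocol's types.
enum Kind { Rental, Payment }

struct SimpleItem {
    Kind kind;
    address token;
    uint256 amount;
}

library TwoPassFilter {
    // First pass counts the rentals so the output array can be sized up front;
    // the second pass copies them in. It works, but it costs an extra loop over
    // the data every time the filter runs.
    function filterRentals(
        SimpleItem[] memory items
    ) internal pure returns (SimpleItem[] memory rentals) {
        uint256 count;
        for (uint256 i; i < items.length; ++i) {
            if (items[i].kind == Kind.Rental) count++;
        }

        rentals = new SimpleItem[](count);
        uint256 j;
        for (uint256 i; i < items.length; ++i) {
            if (items[i].kind == Kind.Rental) {
                rentals[j++] = items[i];
            }
        }
    }
}

The accumulator technique described below avoids that extra counting pass by growing a byte-encoded array as it goes.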
Accumulating Rentals
As I stated before, the process of creating a rental involves converting the OfferItem and ConsiderationItem arrays into buckets of rentals and payments. Before we get to this step, all the items in the Seaport order are first coalesced into a single Item array.
This process is done via _convertToItems in the Create Policy. You can view the source code here.
This conversion was done to make the data types easier to work with by providing a more general representation that encapsulates both OfferItem and ConsiderationItem structs.
Here's how it looks:
struct Item {
ItemType itemType;
SettleTo settleTo;
address token;
uint256 amount;
uint256 identifier;
}
With an Item array fully built up, the next step is to pick out the rental items (ERC721 and ERC1155 tokens) and build up a new array which can be sent over to the Storage Module all in one batch.
To do this, the protocol invokes the Accumulator Package's _insert function:
// Create an accumulator which will hold all of the rental asset updates,
// consisting of IDs and the rented amount. From this point on, new memory
// cannot be safely allocated until the accumulator no longer needs to include elements.
bytes memory rentalAssetUpdates = new bytes(0);
// Check if each item is a rental. If so, then generate the rental asset update.
// Memory will become safe again after this block.
for (uint256 i; i < items.length; ++i) {
if (items[i].isRental()) {
// Insert the rental asset update into the dynamic array.
_insert(
rentalAssetUpdates,
items[i].toRentalId(payload.fulfillment.recipient),
items[i].amount
);
}
}
// ... arbitrary logic executes here
// Add rentals to storage before any other processing
STORE.addRentals(orderHash, _convertToStatic(rentalAssetUpdates));
We start by instantiating a strip of bytes in memory called rentalAssetUpdates. This will contain an append-only encoding of the rental assets.
Since the accumulator uses a custom memory encoding, it manually alters the Solidity free memory pointer. Because of this, no new memory can safely be allocated by Solidity until the accumulator has finished writing. Once it has, the free memory pointer can be advanced automatically as usual.
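To make the constraint concrete, here's an illustrative anti-pattern (not code from the protocol) showing the kind of allocation that must not happen between _insert calls:

for (uint256 i; i < items.length; ++i) {
    if (items[i].isRental()) {
        _insert(
            rentalAssetUpdates,
            items[i].toRentalId(payload.fulfillment.recipient),
            items[i].amount
        );

        // UNSAFE: a Solidity-level allocation here would be handed memory at the
        // free memory pointer, which is exactly where the next _insert call will
        // write its element, so one of the two would be clobbered.
        // bytes memory scratch = abi.encode(items[i].amount);
    }
}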
Finally, the rentalAssetUpdates bytes variable is converted directly into a standard Solidity struct array so that it can be sent to the Storage Module.
This is a lot to unpack, so let's first turn to how the _insert function is implemented.
Inserting a New Rental in Memory
At a high level, the _insert function will take a bytes value (the accumulation) and continually append elements to it until it is done accumulating data. This can be considered the intermediate representation of a Solidity RentalAssetUpdate[] array.
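For reference, a RentalAssetUpdate simply pairs a rental ID with the rented amount. Based on the fields used later in this post, its shape is roughly the following; the exact definition (and whether RentalId wraps a bytes32) lives in the protocol's code, so treat this as a sketch:

// Sketch only: the protocol's actual definitions may differ slightly.
type RentalId is bytes32;

struct RentalAssetUpdate {
    RentalId rentalId; // derived from the item and the rental recipient
    uint256 amount;    // the amount of the asset being rented
}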
Each time _insert is called, it will append a new series of bytes data to the current accumulator in a specific encoding. The definition of the accumulator encoding can be summed up as follows:
[0x00:0x20]: Length of the intermediate representation bytes data, in bytes
[0x20:0x40]: Number of `RentalAssetUpdate` elements stored
[0x40:0x60]: `rentalId` of the first element
[0x60:0x80]: `amount` of the first element
[0x80:0xa0]: `rentalId` of the second element
[0xa0:0xc0]: `amount` of the second element
[0xc0:0xe0]: ...
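As a concrete example, suppose two elements are inserted: a rentalId of 0x01 with an amount of 10, then a rentalId of 0x02 with an amount of 20. The first word holds the total byte length (0xa0: the element-count word plus two 64-byte elements), the second holds the element count (2), and the remaining words hold the two (rentalId, amount) pairs:
[0x00:0x20]: 0x00000000000000000000000000000000000000000000000000000000000000a0
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000002
[0x40:0x60]: 0x0000000000000000000000000000000000000000000000000000000000000001
[0x60:0x80]: 0x000000000000000000000000000000000000000000000000000000000000000a
[0x80:0xa0]: 0x0000000000000000000000000000000000000000000000000000000000000002
[0xa0:0xc0]: 0x0000000000000000000000000000000000000000000000000000000000000014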
We can start with the definition of _insert. It accepts the bytes accumulator, which contains all the elements that have been accumulated so far in the encoding described above. It also accepts a bytes32 and a uint256, which together make up a RentalAssetUpdate struct.
function _insert(
bytes memory rentalAssets,
bytes32 rentalId,
uint256 rentalAssetAmount
) internal pure {
// ... logic goes here
}
As an initial check, the function checks whether rentalAssets has a length of zero. If it's empty, it is given an initial length of 32 bytes, and a zeroed-out 32-byte word is written immediately after it to hold the element count. This code block will only execute when rentalAssets is empty, preparing it to accumulate elements:
assembly {
// This is the first time inserting into this bytes data.
if eq(mload(rentalAssets), 0) {
// Create some space for the initial element length word.
mstore(rentalAssets, 0x20)
// Zero out the number of elements.
mstore(add(0x20, rentalAssets), 0x00)
}
}
At this point, the memory encoding of rentalAssets would look like this:
[0x00:0x20]: 0x0000000000000000000000000000000000000000000000000000000000000020
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000000
Next, we can turn to the logic that will execute each time _insert is called.
A RentalAssetUpdate struct has a total length of 64 bytes (or 0x40). So, on each call to _insert, the total length of the rentalAssets bytes data must increase by 0x40, and the total number of elements held within rentalAssets must be incremented.
// ... logic previously
// Calculate the new size of the bytes data by adding
// the size of a `RentalAssetUpdate` struct.
let newByteDataSize := add(mload(rentalAssets), 0x40)
// Get the pointer for where the element data begins.
let rentalAssetElementPtr := add(rentalAssets, 0x20)
// Increase the number of rental elements by one.
let elements := add(mload(rentalAssetElementPtr), 1)
Once we know how many elements will now be contained within rentalAssets, we need to calculate the exact offset for the new RentalAssetUpdate struct. By multiplying the total number of elements contained in the bytes data by 0x40, we can calculate where to place the new element:
// Calculate the position for the new rental ID.
// To do this, calculate the total length of the element portion, then
// subtract by the initial offset. In this case, the offset is the 32-byte
// word (0x20) which contains the length of the array.
let newItemPosition := add(
rentalAssetElementPtr,
sub(mul(elements, 0x40), 0x20)
)
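To sanity-check the math: on the first insert, elements is 1, so newItemPosition = rentalAssetElementPtr + (1 * 0x40 - 0x20) = rentalAssets + 0x40, which is exactly the [0x40:0x60] slot reserved for the first rentalId in the encoding shown earlier. On the second insert it resolves to rentalAssets + 0x80, the slot for the second rentalId.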
With all calculations completed, all that needs to be done is to store the newly calculated values into memory:
// Store the new byte data size
mstore(rentalAssets, newByteDataSize)
// Store the new number of elements
mstore(rentalAssetElementPtr, elements)
// Store the rental ID
mstore(newItemPosition, rentalId)
// Store the amount in the adjacent 32-byte word
mstore(add(newItemPosition, 0x20), rentalAssetAmount)
Finally, update the Solidity free memory pointer manually so that once we are done accumulating, we can hand memory management back to Solidity automatically:
// Update the free memory pointer so that memory is safe
// once we stop doing dynamic memory array inserts
mstore(0x40, add(newItemPosition, 0x40))
And that's it! There's nothing to return since all updates were made directly in memory. The next few sections will focus on how this intermediate encoding can be used to get the data back into an array encoding that Solidity can naturally parse.
A (Not So) Quick Aside: Solidity Struct Arrays
Recall the function call to the Storage Module that contains all the rentals that were provided by the Item array:
// Add rentals to storage before any other processing
STORE.addRentals(orderHash, _convertToStatic(rentalAssetUpdates));
We can now look at how the custom encoding of rentals can be converted back into a standard Solidity array of type RentalAssetUpdate[]. But first, it's worth going over how standard Solidity arrays are stored in memory.
Solidity memory allocation for arrays is not well-documented since it follows a different encoding than what is found in the Contract ABI Specification. So for the most part, we'll be flying on our own here.
Let's first take a look at the default memory layout before any memory has been allocated so that we can get our bearings. We can do this using Chisel with its !memdump and !stackdump commands. By running !memdump, we can see the initialized memory that all Solidity function calls begin with:
[0x00:0x20]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x40:0x60]: 0x0000000000000000000000000000000000000000000000000000000000000080
[0x60:0x80]: 0x0000000000000000000000000000000000000000000000000000000000000000
According to the memory layout docs, the slot definitions are as follows:
- 0x00 - 0x3f (64 bytes): scratch space for hashing methods
- 0x40 - 0x5f (32 bytes): currently allocated memory size (aka. the free memory pointer)
- 0x60 - 0x7f (32 bytes): zero slot; this is used as an initial pointer value for memory arrays that have not been initialized with a size yet
We can see how the zero slot is utilized once we declare an empty RentalAssetUpdate array in Chisel:
RentalAssetUpdate[] memory assets;
Now, if we were to peek at our stack with !stackdump, we would see this element on top:
[0]: 0x0000000000000000000000000000000000000000000000000000000000000060
[1]: ...
[2]: ...
This indicates that the assets variable is pointing to address 0x60 in memory, which is the zero slot. To see how this changes, we'll instantiate our array next:
assets = new RentalAssetUpdate[](2);
First, let's take a look at the stack to see if anything is different:
[0]: 0x0000000000000000000000000000000000000000000000000000000000000080
[1]: ...
[2]: ...
Interestingly, the stack element for the assets variable has now changed from 0x60 to 0x80. Let's !memdump the memory so we can see what's going on:
[0x00:0x20]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x40:0x60]: 0x0000000000000000000000000000000000000000000000000000000000000160
[0x60:0x80]: 0x0000000000000000000000000000000000000000000000000000000000000000
---------------------------------------------------------------------------------
[0x80:0xa0]: 0x0000000000000000000000000000000000000000000000000000000000000002
[0xa0:0xc0]: 0x00000000000000000000000000000000000000000000000000000000000000e0
[0xc0:0xe0]: 0x0000000000000000000000000000000000000000000000000000000000000120
---------------------------------------------------------------------------------
[0xe0:0x100]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x100:0x120]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x120:0x140]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x140:0x160]: 0x0000000000000000000000000000000000000000000000000000000000000000
We can see that a whole bunch of things have changed. I've added divider lines so that it's easier to see the distinct segments of memory.
In location 0x40, you'll notice that the free memory pointer was updated to 0x160. This makes intuitive sense, because we have now allocated memory up to address 0x160.
Let's turn to location 0x80, which is where the assets variable now points. You'll see that the value stored there is 0x02. This value defines the length of the array.
After the length definition, you'll see two pointers located at 0xa0 and 0xc0, which contain the values 0xe0 and 0x120, respectively. These are considered the head values of the array: locations that hold pointers to where each element's data is actually stored. Recall that each RentalAssetUpdate contains a bytes32 and a uint256, totaling 64 bytes. We can do the math and see that the difference between 0x120 and 0xe0 is 0x40, or 64 bytes.
Looking at location 0xe0, we can see that it contains an empty word followed by another empty word; the same is true at location 0x120. These are spots in memory that have been reserved but not yet written to. They are considered the tail values of the array and contain the actual data of the elements being stored. We'll see how they get populated once we add some elements to the assets array. Let's do that now:
assets[0] = RentalAssetUpdate(bytes32(uint256(5)), uint256(6));
We can now !memdump the memory again to see what changed:
[0x00:0x20]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x40:0x60]: 0x00000000000000000000000000000000000000000000000000000000000001a0
[0x60:0x80]: 0x0000000000000000000000000000000000000000000000000000000000000000
---------------------------------------------------------------------------------
[0x80:0xa0]: 0x0000000000000000000000000000000000000000000000000000000000000002
[0xa0:0xc0]: 0x0000000000000000000000000000000000000000000000000000000000000160
[0xc0:0xe0]: 0x0000000000000000000000000000000000000000000000000000000000000120
---------------------------------------------------------------------------------
[0xe0:0x100]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x100:0x120]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x120:0x140]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x140:0x160]: 0x0000000000000000000000000000000000000000000000000000000000000000
---------------------------------------------------------------------------------
[0x160:0x180]: 0x0000000000000000000000000000000000000000000000000000000000000005
[0x180:0x1a0]: 0x0000000000000000000000000000000000000000000000000000000000000006
This is a bit unexpected. The zero values located at memory regions 0xe0 and 0x120 were not updated; they were left completely alone. Instead, a RentalAssetUpdate element was added starting at location 0x160. We can verify that this is indeed what happened by looking at the head of the array: location 0xa0 switched from holding the value 0xe0 to holding 0x160. It's also worth noting that since the second element in the array is still unassigned, location 0xc0 still points to the zero value located at 0x120.
Let's try adding one more element to the array to see if a pattern emerges:
assets[1] = RentalAssetUpdate(bytes32(uint256(7)), uint256(8));
Now, dumping the memory we get:
[0x00:0x20]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x20:0x40]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x40:0x60]: 0x00000000000000000000000000000000000000000000000000000000000001e0
[0x60:0x80]: 0x0000000000000000000000000000000000000000000000000000000000000000
---------------------------------------------------------------------------------
[0x80:0xa0]: 0x0000000000000000000000000000000000000000000000000000000000000002
[0xa0:0xc0]: 0x0000000000000000000000000000000000000000000000000000000000000160
[0xc0:0xe0]: 0x00000000000000000000000000000000000000000000000000000000000001a0
---------------------------------------------------------------------------------
[0xe0:0x100]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x100:0x120]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x120:0x140]: 0x0000000000000000000000000000000000000000000000000000000000000000
[0x140:0x160]: 0x0000000000000000000000000000000000000000000000000000000000000000
---------------------------------------------------------------------------------
[0x160:0x180]: 0x0000000000000000000000000000000000000000000000000000000000000005
[0x180:0x1a0]: 0x0000000000000000000000000000000000000000000000000000000000000006
[0x1a0:0x1c0]: 0x0000000000000000000000000000000000000000000000000000000000000007
[0x1c0:0x1e0]: 0x0000000000000000000000000000000000000000000000000000000000000008
Interesting. This time the new element was allocated exactly where it should have gone: directly after element 0 in the array. It's also worth mentioning that the head pointer for element 1, located at 0xc0, now points away from the zero value stored at 0x120 and toward the actual data that we stored, which starts at 0x1a0.
So what are these four consecutive zero words doing in memory, from 0xe0 up to 0x160? These regions seem to be used as scratch space for each element during some operations in Solidity. The most clarity that can be found is in the Layout in Memory section of the Solidity documentation, which reads:
There are some operations in Solidity that need a temporary memory area larger than 64 bytes and therefore will not fit into the scratch space. They will be placed where the free memory points to, but given their short lifetime, the pointer is not updated. The memory may or may not be zeroed out. Because of this, one should not expect the free memory to point to zeroed out memory.
Because of situations like this where definitions of the Solidity memory layout are a bit vague, I felt that it was safer to design an intermediate encoding of an array and then manually convert it back to a Solidity array rather than attempting to mimic Solidity's process of memory management.
We can now turn to how I implemented this conversion from the intermediate encoding into a standard Solidity memory array.
Getting Out of the Intermediate Representation
Now armed with the knowledge of how Solidity struct arrays are encoded, it should be a breeze to cover how we convert the custom intermediate representation back into a Solidity array.
We'll start with the definition of _convertToStatic. It accepts only the intermediate-encoded rentalAssetUpdates parameter and returns a native Solidity RentalAssetUpdate[] array.
function _convertToStatic(
bytes memory rentalAssetUpdates
) internal pure returns (RentalAssetUpdate[] memory updates) {
// ... logic continues on
}
This implementation takes a hybrid Solidity/Yul approach to extract the values: the return value is defined in Solidity, but the extraction of the elements is done using Yul.
The conversion begins by getting a pointer to the start of the encoded elements and reading the total number of elements in the encoding. Then, the Solidity RentalAssetUpdate[] array can be instantiated with the exact number of elements specified in the encoding:
// Pointer to the rental asset update data.
bytes32 rentalAssetUpdatePointer;
// Load the length of the rental asset update items.
uint256 rentalAssetUpdateLength;
assembly {
// Get a pointer to the number of elements in the bytes data.
// Without the 0x20 offset, we would be loading the length of the entire
// byte string, but we want the element count, which starts one
// word to the right.
rentalAssetUpdatePointer := add(0x20, rentalAssetUpdates)
// Load the number of elements.
rentalAssetUpdateLength := mload(rentalAssetUpdatePointer)
}
// Instantiate the update array.
updates = new RentalAssetUpdate[](rentalAssetUpdateLength);
From here, each element is extracted in a loop by calculating its offset in memory, since we know that each RentalAssetUpdate struct is exactly 0x40 bytes long.
// Iterate through each item in the byte data, and add it as
// an entry to the array.
for (uint256 i = 0; i < rentalAssetUpdateLength; ++i) {
// Define the placeholders.
RentalId rentalId;
uint256 amount;
// Extract the current element from the byte data.
assembly {
// Determine the element offset by multiplying the length of a
// RentalAssetUpdate struct (0x40) by the current index, then
// adding one word (0x20) to skip past the element-count word
// that the data pointer sits on.
let currentElementOffset := add(0x20, mul(i, 0x40))
// Load the rental ID starting at the data pointer.
rentalId := mload(add(rentalAssetUpdatePointer, currentElementOffset))
// Load the amount at the data pointer adjacent to it.
amount := mload(
add(0x20, add(rentalAssetUpdatePointer, currentElementOffset))
)
}
// Set the items
updates[i] = RentalAssetUpdate(rentalId, amount);
}
Once the loop completes, the updates array can be returned and used as normal in other areas of the Solidity code.
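To tie the two halves together, here is a rough Foundry-style sketch of the full round trip: accumulate two updates, convert them back, and read them as a normal array. It assumes the _insert and _convertToStatic helpers can be inherited from the package's abstract contract and that RentalId wraps a bytes32, so adjust the imports and types to the actual repo layout:

// Sketch only; import paths and type locations are assumptions.
// import {Test} from "forge-std/Test.sol";
// import {Accumulator} from "src/packages/Accumulator.sol";
// import {RentalId, RentalAssetUpdate} from "src/RentalStructs.sol";

contract AccumulatorRoundTripTest is Test, Accumulator {
    function test_accumulateAndConvert() public {
        // Start with an empty accumulator. No other memory may be allocated
        // until the inserts are finished.
        bytes memory rentalAssetUpdates = new bytes(0);

        _insert(rentalAssetUpdates, bytes32(uint256(1)), 10);
        _insert(rentalAssetUpdates, bytes32(uint256(2)), 20);

        // Convert the intermediate encoding back into a native Solidity array.
        RentalAssetUpdate[] memory updates = _convertToStatic(rentalAssetUpdates);

        assertEq(updates.length, 2);
        assertEq(RentalId.unwrap(updates[0].rentalId), bytes32(uint256(1)));
        assertEq(updates[0].amount, 10);
        assertEq(RentalId.unwrap(updates[1].rentalId), bytes32(uint256(2)));
        assertEq(updates[1].amount, 20);
    }
}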
Conclusion
At its core, the Accumulator is a technique for appending elements to an in-memory array whose final size isn't known ahead of time. The package fills a gap in the Solidity language, since there still isn't a native way to work with dynamically sized arrays in memory.
I would not have been able to create this implementation without a heavy reliance on one of my favorite tools, Chisel. I encourage you to use Chisel with the examples in this post to see how Solidity allocates array memory for yourself.
You can view the implementation of the Accumulator Package here, and if you have any further questions about how the accumulator works, feel free to reach out to me at alec@021.gg or on x.com.