From Abstract to Accessible: Ethereum's Journey Towards Programmable Accounts

A Clear Guide to Ethereum's Account Abstraction

Introduction

Unless you've been living under a rock (who knows, maybe even Patrick is excited about it), you've probably heard about account abstraction and ERC-4337, or at least noticed the excitement surrounding it. With ERC-4337, Vitalik's original vision for accounts can finally be realized. He initially aimed to build a system with native account abstraction, but after 1.5 years of initial development, he was forced to ship Ethereum without it due to community pressure. Now, with ERC-4337 live, we can fully appreciate the benefits of having our accounts as smart contracts, a significant shift from what they've been since Ethereum's inception.

For the remainder of this piece, I'll refer to smart contract accounts simply as smart accounts. These smart accounts allow you to program your desired verification rules as well as execution calls. This functionality means you can sign transactions with biometric methods like Face ID, encode multiple contract calls in a single action (such as approve and swap), enable social recovery methods, and facilitate pull payments like authorizing Netflix to withdraw 10 USDC a month. You can also set specific rules for your keys, such as allowing someone to trade a certain amount of tokens on your behalf without the ability to move NFTs or USDC, among other functionalities.

However, to truly grasp the workings and implications of account abstraction, it's essential first to understand how Ethereum accounts and transactions functioned before the rise of ERC-4337. And that's where our journey begins.

Ethereum Initially

In Ethereum, an account represents a participant with a unique address, capable of holding ETH. There are two types of accounts:

  • Externally Owned Account (EOA): Controlled by individuals or entities who possess the corresponding private key. EOAs are capable of initiating transactions directly, such as transferring ETH, interacting with smart contracts, or deploying new contracts.

  • Contract Account: Automated agents, known as smart contracts, that are deployed on the Ethereum network. They are governed by their embedded code and do not possess private keys. Unlike EOAs, Contract Accounts cannot initiate transactions on their own. They act only when triggered, either by a transaction sent from an EOA or by a call from another contract that was itself set in motion by an EOA's transaction.

EOA Overview

Every transaction must originate from an EOA, and an EOA is controlled directly through its private key. Possession of this key allows an individual (or entity) to sign transactions, authorizing actions such as transferring ETH or interacting with smart contracts. The signature is the digital proof that the transaction was authorized by the holder of the private key associated with the sending address. That address, which is derived from the private key via the public key, is used as the from field in the transaction object.

When an Ethereum full node receives a transaction from a peer, it validates the transaction's signature. From the transaction and its signature, the node recovers the address that signed it. If the recovered address does not match the from address, the transaction is immediately rejected, which blocks unauthorized transactions on the network.

The from address also serves as the account's visible identifier on the blockchain, enabling other participants to send funds to it. Anyone can create a transaction whose from address is any address. For instance, I could create a transaction where the from address is Vitalik's address. However, without a valid signature from the corresponding private key, the transaction will be rejected. Ownership of an EOA's private key therefore means complete control over the account and its funds.

“Creating” An EOA

Creating an EOA on the Ethereum network is a bit of a misnomer. Rather than creating a new account in the traditional sense, generating a new EOA is more about gaining access to a specific address within Ethereum's vast, pre-existing address space. This process is straightforward and free of charge. Your Ethereum node can easily set up a new account upon request. The first step is generating a private key, a unique, random, 256-bit number. The chances of duplicating a private key are virtually nonexistent, comparable to finding a specific atom within the observable universe. Here’s the enormity of that number (2^256) for perspective: 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936. Even with the entire global population generating a billion keys every second, the probability of a repeat within our universe's lifespan is minuscule.

Using cryptographic algorithms, a public key is derived from the private key. Ethereum, like Bitcoin, uses the Elliptic Curve Digital Signature Algorithm (ECDSA) with the secp256k1 curve. The process is a one-way street: producing a public key from a private one is easy and cheap, whereas the reverse is nearly impossible with today's technology (that is, until sufficiently powerful quantum computers arrive). The public key is a 512-bit figure, formed by two 256-bit coordinates on the elliptic curve. This deterministic process ensures the same private key always results in the same public key. The Ethereum account address is then created using the Keccak-256 hashing algorithm, which processes the public key to produce a 256-bit hash. The unique Ethereum address is formed from the last 160 bits (20 bytes) of this hash. In Ethereum's design, all possible addresses theoretically exist within its vast address space. It's possible to send ETH to an address for which no private key has yet been generated. Remarkably, if a private key for that address is later generated, the individual who generated it would gain access to the sent assets.

For those interested in a deeper dive into the cryptography involved, the Elliptic Curve Cryptography Overview and The Discrete Logarithm Problem videos provide an excellent breakdown of how the cryptography works and why it remains secure.

Contract Account Overview

Contract accounts, commonly known as smart contracts, differ significantly from EOAs. While EOAs are controlled by individuals through private keys, contract accounts are governed by predetermined code and lack private keys. Contract accounts are created and deployed on Ethereum through a transaction initiated by an EOA. The code is written in a programming language such as Solidity and compiled to EVM bytecode before deployment; once deployed, that bytecode is executed by the EVM.

Every time a contract account receives a transaction from an EOA, or a call from another contract, its code is executed by the EVM. These accounts maintain an internal state, a record of data like balances and ownership details, which can be altered according to the rules set in the contract's code. For instance, DeFi applications and ERC-20 token contracts are contract accounts, each with its own set of rules and functionalities.

The embedded code in contract accounts often includes validation rules that define the conditions for successful transactions or function executions. If a transaction or function call does not meet these predefined rules, the contract will automatically revert the transaction. For example, a smart contract might have a rule that only allows the contract owner to withdraw funds. If someone else tries to invoke the withdrawal function, the contract will check against its ownership condition, fail the validation, and the transaction will revert, ensuring the assets remain secure.

Creating a Contract Account

Creating (deploying) a contract account from an EOA costs a gas fee, reflecting the computational resources required to update the Ethereum network with the new contract's code and state. There are two ways to create a contract account. The first, and standard, method uses the CREATE opcode. In Solidity it is invoked with the syntax new ContractName(). The address generated this way is determined by two factors: the creator's address and the creator's nonce (its transaction count, for an EOA). Because the nonce increases with each transaction the creator makes, the resulting address depends on when the deployment happens and is hard to commit to ahead of time.

The process for determining the contract address using the CREATE opcode can be illustrated through the following pseudo-code:

rlpEncoded = RLP([senderAddress, nonce]);
hash = keccak256(rlpEncoded);
contractAddress = '0x' + last20Bytes(hash);

* RLP stands for Recursive Length Prefix, a serialization method used in Ethereum.

The other method is using the CREATE2 opcode to allow for more predictable contract address generation. This method can be executed in Solidity using new ContractName{salt: salt}(), where salt is a parameter defined by the user. The contract address in this case is derived from a combination of the sender's address, the salt, and the contract's initialization code.

The pseudo-code for the CREATE2 opcode looks something like this:

hash = keccak256(0xff ++ senderAddress ++ salt ++ keccak256(init_code)); 
contractAddress = '0x' + last20Bytes(hash);

* init_code refers to the contract's creation bytecode (the code that runs at deployment, including any constructor arguments).
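The two derivations above can be sketched in Python. This is a minimal illustration, not a production implementation: the standard library has no Keccak-256 (hashlib's sha3_256 uses different padding), so the sketch carries its own small keccak256 helper, and the RLP helpers only handle the short items needed here. All function names are my own.

```python
MASK64 = (1 << 64) - 1

def _rol(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64

def _keccak_f(s: list) -> list:
    """Keccak-f[1600] permutation; lane (x, y) lives at index x + 5*y."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into its neighbouring columns
        c = [s[x] ^ s[x + 5] ^ s[x + 10] ^ s[x + 15] ^ s[x + 20] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        s = [s[i] ^ d[i % 5] for i in range(25)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        cur = s[1]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, s[x + 5 * y] = s[x + 5 * y], _rol(cur, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied row by row
        for y in range(5):
            row = s[5 * y:5 * y + 5]
            for x in range(5):
                s[x + 5 * y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5] & MASK64)
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                s[0] ^= 1 << ((1 << j) - 1)
    return s

def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (original Keccak padding, not SHA3-256)."""
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data) + bytearray(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    state = [0] * 25
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            lane = padded[offset + 8 * i:offset + 8 * i + 8]
            state[i] ^= int.from_bytes(lane, "little")
        state = _keccak_f(state)
    return b"".join(lane.to_bytes(8, "little") for lane in state[:4])

def rlp_bytes(item: bytes) -> bytes:
    """RLP-encode a byte string shorter than 56 bytes."""
    if len(item) == 1 and item[0] < 0x80:
        return item
    if len(item) >= 56:
        raise ValueError("long items are not handled in this sketch")
    return bytes([0x80 + len(item)]) + item

def rlp_nonce(nonce: int) -> bytes:
    """RLP-encode an integer as its minimal big-endian bytes (0 becomes empty)."""
    return rlp_bytes(nonce.to_bytes((nonce.bit_length() + 7) // 8, "big"))

def create_address(sender: bytes, nonce: int) -> str:
    """CREATE: last 20 bytes of keccak256(RLP([sender, nonce]))."""
    payload = rlp_bytes(sender) + rlp_nonce(nonce)
    encoded = bytes([0xC0 + len(payload)]) + payload  # short-list prefix
    return "0x" + keccak256(encoded)[-20:].hex()

def create2_address(sender: bytes, salt: bytes, init_code: bytes) -> str:
    """CREATE2: last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))."""
    preimage = b"\xff" + sender + salt + keccak256(init_code)
    return "0x" + keccak256(preimage)[-20:].hex()
```

Calling create_address with the same sender and two different nonces gives two different addresses, while create2_address depends only on the sender, the 32-byte salt, and the init_code, which is why it can be computed long before deployment.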

Accounts in the EVM

In the EVM, every account is associated with four fields:

  • Nonce: For EOAs, this counts the number of transactions sent. For contract accounts, it represents the number of contracts created. The nonce prevents replay attacks by ensuring each transaction is unique.

  • Balance: The amount of Ether in the account, measured in wei. There are 1e+18 wei in one ETH, with wei being the smallest denomination of Ether.

Fields Specific to Contract Accounts:

  • CodeHash: The immutable hash of the EVM code that defines a contract account's functions and behavior. In EOAs, this is a hash of an empty string, reflecting the absence of executable code.

  • StorageRoot: Known as the storage hash, it's derived from the root node of a Merkle Patricia trie, a data structure for mapping keys to values. This is used by contract accounts for data storage. EOAs, which do not store data in this manner, have a constant value representing an empty trie.

Transactions Overview

Transactions are actions that change the Ethereum network's state. All Ethereum nodes agree on this state and share transactions that change it. When a block with new transactions is confirmed, it updates the network's agreed-upon state.

Ethereum Transaction Structure:

  • From: The sender's Ethereum address (EOA).

  • To (Recipient): The receiving address. ETH is transferred to EOAs, or it triggers actions in contract accounts.

  • Value: Amount of ETH being transferred, in wei.

  • Gas Limit: The maximum amount of gas the transaction can use.

  • MaxPriorityFeePerGas: The maximum tip per unit of gas paid to the block producer (a miner before the Merge, a validator after it).

  • MaxFeePerGas: The maximum fee per gas the sender is willing to pay, inclusive of the base fee and priority fee.

  • Nonce: A sequential number tied to the sender's address, indicating the number of transactions sent.

  • Input Data: Used for calling contract functions, sending info, or deploying contracts.

  • Signature (v,r,s): Added after the transaction is signed, confirming its authenticity.

A sample transaction for sending 0.01 ETH from one address to another would look like this:

{ 
  "from": "0xEA674fdDe714fd979de3EdF0F56AA9716B898ec8", 
  "to": "0xac03bb73b6a9e108530aff4df5077c2b3d481e5a", 
  "gasLimit": "21000", 
  "maxFeePerGas": "300", 
  "maxPriorityFeePerGas": "10", 
  "nonce": "0", 
  "value": "10000000000000000", 
  "inputData": "", 
  "v": "0x1b", 
  "r": "0xb31c...1f", 
  "s": "0xc1b9...4e" 
}

The signature components (v, r, s) confirm the sender authorized the transaction. Once the signed transaction is shared with the network and validated by nodes, it can be added to the blockchain. The next section will detail these validation checks.
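To make the two fee fields concrete, here is a small worked calculation in Python. It assumes the sample's "300" and "10" are denominated in gwei and uses a hypothetical base fee of 100 gwei; the function name is my own.

```python
GWEI = 10**9  # 1 gwei = 1e9 wei

def effective_fees(base_fee: int, max_fee: int, max_priority_fee: int):
    """Return (price paid per gas, tip per gas) in wei for a fee-market transaction."""
    if max_fee < base_fee:
        raise ValueError("maxFeePerGas is below the current base fee")
    # The tip is capped by whatever is left of maxFeePerGas after the base fee.
    tip = min(max_priority_fee, max_fee - base_fee)
    return base_fee + tip, tip

# Hypothetical base fee of 100 gwei; fee caps taken from the sample transaction.
price, tip = effective_fees(100 * GWEI, 300 * GWEI, 10 * GWEI)
total_cost_wei = 21_000 * price  # a plain ETH transfer uses 21,000 gas
print(price // GWEI, tip // GWEI, total_cost_wei)
```

Under these assumptions the sender pays 110 gwei per gas, of which 10 gwei is the tip, even though maxFeePerGas allowed up to 300.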

Life Cycle of A Transaction

Simple ETH Transfer

The journey of a simple ETH transaction begins with the creation of a transaction object by your wallet software.

An example transaction object for sending 1 ETH could look like this:

{
  "from": "0xEA674fdDe714fd979de3EdF0F56AA9716B898ec8",
  "to": "0xac03bb73b6a9e108530aff4df5077c2b3d481e5a",
  "gasLimit": "21000",
  "maxFeePerGas": "300",
  "maxPriorityFeePerGas": "10",
  "nonce": "0",
  "value": "1000000000000000000",
  "inputData": ""
}

The next step is signing the transaction. To sign it, you serialize the transaction object, hash the result with Keccak-256, and then sign this hash with the sender's private key using the Elliptic Curve Digital Signature Algorithm (ECDSA) on the secp256k1 curve. The resulting signature, comprising 'v', 'r', and 's' values, is appended to the transaction, forming the signed transaction. Once signed, the Ethereum wallet or interface prepares the transaction for broadcast on the Ethereum network. This typically involves serializing the transaction into a raw hexadecimal format and then sending it to an Ethereum full node, often through an Ethereum client like Geth or a service like Infura.

Ethereum nodes, upon receiving a transaction, perform a series of checks to verify its authenticity and viability. These checks include signature verification against the sender's address, matching the nonce with the sender's account nonce, ensuring the gas limit is appropriate, confirming the sender's balance can cover the transaction value and gas cost, verifying the gas price, and checking that the transaction format adheres to Ethereum's standards. If a transaction passes these checks, the node adds it to its local mempool, a holding area for transactions before they are included in a block. The node then propagates the transaction to other nodes in the network.
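The checks described above can be sketched as a single function. This is a simplified model rather than any client's actual code: signature recovery is not implemented (the recovered signer is passed in as a field), and all class and field names are my own.

```python
from dataclasses import dataclass

INTRINSIC_GAS = 21_000  # minimum gas for a plain ETH transfer

@dataclass
class Account:
    nonce: int    # number of transactions already sent
    balance: int  # in wei

@dataclass
class Tx:
    sender: str            # the "from" address
    recovered_signer: str  # address recovered from (v, r, s); recovery not modeled here
    nonce: int
    value: int
    gas_limit: int
    max_fee_per_gas: int
    max_priority_fee_per_gas: int

def validate(tx: Tx, account: Account, base_fee: int) -> list:
    """Return the list of reasons a node would reject the transaction (empty if none)."""
    errors = []
    if tx.recovered_signer.lower() != tx.sender.lower():
        errors.append("signature does not match from address")
    if tx.nonce < account.nonce:
        errors.append("nonce too low")
    if tx.gas_limit < INTRINSIC_GAS:
        errors.append("gas limit below intrinsic gas")
    if tx.max_priority_fee_per_gas > tx.max_fee_per_gas:
        errors.append("priority fee exceeds max fee")
    if tx.max_fee_per_gas < base_fee:
        errors.append("max fee below base fee")
    # Worst case, the sender pays gas_limit * max_fee_per_gas on top of the value.
    if account.balance < tx.value + tx.gas_limit * tx.max_fee_per_gas:
        errors.append("insufficient balance")
    return errors

GWEI = 10**9
alice = "0xEA674fdDe714fd979de3EdF0F56AA9716B898ec8"
state = Account(nonce=0, balance=2 * 10**18)
good = Tx(alice, alice, 0, 10**18, 21_000, 300 * GWEI, 10 * GWEI)
forged = Tx(alice, "0x" + "11" * 20, 0, 10**18, 21_000, 300 * GWEI, 10 * GWEI)
print(validate(good, state, 100 * GWEI))
print(validate(forged, state, 100 * GWEI))
```

The forged transaction fails only the first check, which is exactly the case of someone putting another person's address in the from field without holding the key.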

Finally, once a transaction is included in a valid block and the block is accepted by the network, the transaction is confirmed; under proof of stake, it becomes final once that block is finalized. The Ethereum blockchain updates to reflect the state change resulting from the transaction, modifying the involved parties' ETH balances accordingly.

The key thing I want you to take away from this lifecycle is that executing an ETH transaction requires proving ownership of the sender's address through the private key. This proof is checked by the nodes themselves via signature verification, at the protocol level rather than in any contract code. Once validated and processed, the transaction alters the blockchain's state, updating the ETH balances of those involved.

Smart Contract Interactions

Interacting with a smart contract differs from a basic ETH transfer in a number of ways. One key difference is the use of the inputData field within the transaction object. This field encodes both the function call and any arguments required by the function being called. The first four bytes of inputData are the function selector. The selector is the first four bytes of the Keccak-256 hash of the function's signature (for example, transfer(address,uint256)), and it determines which function within the contract will be called. Following the function selector, the arguments are provided in ABI-encoded format, with each static argument padded to 32 bytes.
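As a rough illustration of that layout, the sketch below builds the inputData for a transfer(address,uint256) call by hand. The selector 0xa9059cbb is hardcoded because it is the well-known first four bytes of the Keccak-256 hash of that signature; the function name and the example recipient and amount are my own choices.

```python
# First four bytes of keccak256("transfer(address,uint256)").
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")

def encode_transfer(recipient: str, amount: int) -> str:
    """Build calldata: 4-byte selector, then each argument left-padded to 32 bytes."""
    hex_part = recipient[2:] if recipient.startswith("0x") else recipient
    address = bytes.fromhex(hex_part)
    if len(address) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    data = (
        TRANSFER_SELECTOR
        + address.rjust(32, b"\x00")   # address argument, padded to 32 bytes
        + amount.to_bytes(32, "big")   # uint256 argument, big-endian
    )
    return "0x" + data.hex()

calldata = encode_transfer("0xac03bb73b6a9e108530aff4df5077c2b3d481e5a", 10**18)
print(calldata[:10])            # the selector
print(len(calldata[2:]) // 2)   # total bytes: 4 + 32 + 32
```

The result is 68 bytes long: 4 bytes of selector plus two 32-byte words, one per argument.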

Here’s an example of a transaction object calling a smart contract function:

{ 
  "from": "0xEA674fdDe714fd979de3EdF0F56AA9716B898ec8", 
  "to": "0xContractAddressHere", 
  "gasLimit": "HigherThan21000", 
  "maxFeePerGas": "AppropriateFee", 
  "maxPriorityFeePerGas": "AppropriateTip", 
  "nonce": "TransactionCount", 
  "value": "AmountOfETHToSend (can be 0)", 
  "inputData": "FunctionCallData",
  "v": "0x1b", 
  "r": "0xb31c...1f", 
  "s": "0xc1b9...4e" 
}

When a transaction is sent to a smart contract, Ethereum nodes validate the signature, ensuring ownership of the sender's address through the private key, and then they run the transaction through the EVM to execute it. The smart contract's code determines whether the transaction is valid. If any of the contract's conditions are not met, the EVM reverts all state updates made during the transaction's execution. However, the sender still incurs the gas costs associated with processing the transaction, reflecting the computational resources used up until the point of failure.

Smart Contract Example - Multi-Signature Wallet:

In this multi-signature wallet contract, the confirmTransaction function includes several checks:

  • onlyOwner: Ensures only wallet owners can call the function.

  • txExists: Verifies the existence of the transaction.

  • notExecuted: Prevents double execution of a transaction.

  • notConfirmed: Stops an owner from confirming the same transaction multiple times.

Successful completion of these checks leads to a state update in the contract, reflecting the confirmation. If the required number of confirmations is reached, the transaction executes. 
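The four checks can be modeled in a few lines of Python. This is only a model of the logic described above, not the Solidity contract itself; the class layout, exception types, and the submit helper are my own assumptions.

```python
class MultiSigWallet:
    """Toy model of a multi-signature wallet's confirmation logic."""

    def __init__(self, owners, required):
        self.owners = set(owners)
        self.required = required
        self.transactions = []  # each entry: {"to", "value", "executed", "confirmations"}

    def submit(self, caller, to, value):
        if caller not in self.owners:
            raise PermissionError("onlyOwner")
        self.transactions.append(
            {"to": to, "value": value, "executed": False, "confirmations": set()}
        )
        return len(self.transactions) - 1  # the new transaction's id

    def confirm_transaction(self, caller, tx_id):
        if caller not in self.owners:                   # onlyOwner
            raise PermissionError("onlyOwner")
        if not 0 <= tx_id < len(self.transactions):     # txExists
            raise LookupError("txExists")
        tx = self.transactions[tx_id]
        if tx["executed"]:                              # notExecuted
            raise RuntimeError("notExecuted")
        if caller in tx["confirmations"]:               # notConfirmed
            raise RuntimeError("notConfirmed")
        tx["confirmations"].add(caller)                 # state update
        if len(tx["confirmations"]) >= self.required:
            tx["executed"] = True                       # threshold reached: execute

wallet = MultiSigWallet(owners=["alice", "bob", "carol"], required=2)
tx_id = wallet.submit("alice", to="dave", value=1)
wallet.confirm_transaction("alice", tx_id)
wallet.confirm_transaction("bob", tx_id)
print(wallet.transactions[tx_id]["executed"])
```

Raising an exception here plays the role of a revert: none of the state changes below the failing check are applied.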

In summary, both simple ETH transfers and smart contract interactions require proof of ownership via the private key associated with the from address. Transactions on the Ethereum network are authenticated this way, and the process is not customizable. However, smart contracts can be designed to include onchain verifications as determined by the contract's code.

Limitations with the Initial Account and Transaction System

The initial account and transaction system faced limitations. One was that transaction verification was restricted to ECDSA on the secp256k1 curve, which only checks the transaction's from address against the signature. While it's possible to add verification rules into smart contracts, ideally, accounts themselves would have this same customizability. Another limitation was that a single transaction could only call one smart contract and execute just one of its functions, rather than allowing multiple actions to be encoded in one transaction. Furthermore, there was no built-in system for an address to cover the gas fees for another user's transaction, either for free or in return for a different token.

To begin addressing these issues, Ethereum needed to transition toward a system where contract accounts can initiate transactions. Contract accounts in this system would need two important functions: one to verify that the person sending the transaction is indeed authorized to use the account, and another to execute the transaction's intended actions. This change primarily solves the programmable account problem, allowing for the integration of various verification rules within the accounts. As an example, just as the developer implemented predefined checks in the Multi-Signature contract discussed in the previous section, users can now encode similar rules for their own accounts. For instance, users could create contract accounts with rules to prevent transactions over a certain amount within a specific time frame. If such a transaction is attempted, network nodes would reject it. Contract accounts could also be set to perform specific actions after each transaction, like automatically donating to Gitcoin. Additionally, these accounts would enable native multi-signature features and social recovery options, improving security and user experience. However, even further modifications to the EVM would be needed to fully address the challenges of executing multiple actions in a single transaction and enabling one user to pay for another's transaction fees.
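A toy model may help picture the two functions. The sketch below is not tied to any real account standard: it stands in for signature verification with a plain comparison of signer names, and the class name, method names, and the spending-limit rule are illustrative assumptions based on the example above.

```python
class SpendingLimitAccount:
    """Toy account with a verification step and an execution step."""

    def __init__(self, owner: str, limit_wei: int, window_seconds: int):
        self.owner = owner
        self.limit_wei = limit_wei
        self.window_seconds = window_seconds
        self.history = []  # (timestamp, amount) of executed transfers

    def validate(self, signer: str, amount: int, now: int) -> bool:
        """Verification: is the signer authorized, and is the amount within the limit?"""
        if signer != self.owner:  # stand-in for a real signature check
            return False
        recent = sum(a for t, a in self.history if now - t < self.window_seconds)
        return recent + amount <= self.limit_wei

    def execute(self, signer: str, amount: int, now: int) -> None:
        """Execution: runs only if verification passes."""
        if not self.validate(signer, amount, now):
            raise PermissionError("rejected by account rules")
        self.history.append((now, amount))

DAY = 24 * 60 * 60
account = SpendingLimitAccount("alice", limit_wei=10**18, window_seconds=DAY)
account.execute("alice", 6 * 10**17, now=0)
print(account.validate("alice", 6 * 10**17, now=3600))     # within the same day
print(account.validate("alice", 6 * 10**17, now=DAY + 1))  # after the window has passed
```

The point of the split is that the first function can encode any rule the account owner wants, while the second carries out the action only once that rule has been satisfied.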

ERC-4337

How to Fix the System

A number of EIPs have been proposed to guide this process. Vitalik gave a detailed overview of the history and development of account abstraction in Ethereum, including the early ideas, various EIPs that were considered but not implemented, and where we stand today, in his presentation at EthCC. For those interested in a deeper understanding of this journey, you can watch his presentation here.

This transition towards account abstraction is complex and requires thorough testing to ensure the network's stability. Considering Ethereum's crucial role in finance, art, and other sectors, any significant disruption could have wide-reaching impacts. The Ethereum Core Developers had been focusing on the successful implementation of the Merge, transitioning Ethereum to proof of stake. Their current areas of focus include scaling solutions like Proto-Danksharding and Danksharding, and other initiatives such as stateless clients and Verkle trees. The core devs are already operating at full capacity and face challenges in managing the additional workload required for implementing account abstraction.

With that in mind, the Ethereum community introduced ERC-4337, a standard that is already operational on the network. This standard enables the benefits of account abstraction immediately, without requiring modifications to the EVM. It does so by introducing a "user intent layer" that works upstream of the current transaction process. This layer, coupled with a new set of standards to follow and a new entry point smart contract (I'll explain this contract in detail later), brings the benefits of account abstraction to Ethereum without modifying the underlying protocol.

Layer 2 solutions are actively exploring account abstraction as well. For example, zkSync has already integrated account abstraction into their protocol. Ethereum's modular roadmap allows for Layer 2s to experiment with modifying the EVM without permission from the base layer. Additionally, other proposals suggest incorporating a different account abstraction architecture into various Layer 2 solutions. As these solutions mature and demonstrate their security and effectiveness, they present a model that could be adopted by Ethereum Mainnet in the future. Meanwhile, ERC-4337 serves as a crucial intermediary, providing the immediate benefits of account abstraction.

ERC-4337 Components

Although the EVM remains unchanged, ERC-4337 adds new elements to Ethereum, changing how transactions are created and handled. Let's break down these components:

  • User Operations (UserOps): Similar to transactions, UserOps are objects that represent what a user wants to do. They're structured with fields like sender, nonce (for anti-replay and salt in first-time account creation), initCode (for deploying the smart account if it doesn't exist), callData (the execution data), gas limits for different phases, fees, paymaster data (for gas fee sponsorship), and the signature.

  • Bundlers: They are nodes that collect, simulate, bundle, and submit UserOps onchain. Currently, UserOps are sent directly to bundlers, but future plans include gathering them from a specialized mempool. Bundlers simulate these operations to make sure they will be valid onchain, bundle, and send them as a single transaction to the EntryPoint Contract.

  • EntryPoint Contract: This contract handles the bundled UserOps. It verifies each UserOp, ensuring the smart accounts involved meet the required conditions. Successful verification leads to the execution phase. The EntryPoint Contract also manages gas use and compensates bundlers, as they cover the initial gas costs from their accounts.

  • Account Contracts (Smart Accounts): These are user owned smart contract wallets. Wallet developers must implement functions in these contracts for verification and processing transactions.

  • Factory Contract: Used in setting up new wallets. It uses the initCode from a UserOp to create the wallet, using the CREATE2 opcode for predictable addresses.

  • Paymaster Contracts: Optional contracts that can sponsor transaction fees for Account Contracts, allowing fees to be paid in ERC-20 tokens or for free rather than ETH.

  • Aggregator Contract: Designed to lower Layer 2 costs by verifying multiple UserOps with a single BLS signature, thus reducing data size. No aggregator contracts are in production yet.
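To see how the first two components fit together, here is a minimal sketch of a UserOp as a data structure and of a bundler picking valid ones for a bundle. The field names follow the example UserOp objects in this post (written in snake_case); build_bundle, the simulate callback, and the max_ops cap are hypothetical stand-ins, not part of any bundler's real API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class UserOperation:
    sender: str                    # the smart account's address
    nonce: int                     # anti-replay value
    init_code: bytes               # non-empty only when the account must be deployed
    call_data: bytes               # what the account should execute
    call_gas_limit: int
    verification_gas_limit: int
    pre_verification_gas: int
    max_fee_per_gas: int
    max_priority_fee_per_gas: int
    paymaster_and_data: bytes      # empty when no paymaster sponsors the fees
    signature: bytes

def build_bundle(
    pending: List[UserOperation],
    simulate: Callable[[UserOperation], bool],
    max_ops: int = 10,
) -> List[UserOperation]:
    """Keep only UserOps that pass simulation, up to max_ops per bundle."""
    bundle = []
    for op in pending:
        if len(bundle) == max_ops:
            break
        if simulate(op):  # hypothetical stand-in for simulating validation
            bundle.append(op)
    return bundle

def make_op(nonce: int, signature: bytes) -> UserOperation:
    return UserOperation("0xSmartAccountAddress", nonce, b"", b"", 100_000,
                         50_000, 21_000, 10**9, 10**9, b"", signature)

pending_ops = [make_op(0, b"\x01"), make_op(1, b""), make_op(2, b"\x02")]
bundle = build_bundle(pending_ops, simulate=lambda op: len(op.signature) > 0)
print([op.nonce for op in bundle])
```

In a real deployment the resulting list would be submitted in one transaction to the EntryPoint, which then takes over verification and execution onchain.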

Architecture Overview

The EntryPoint contract is central to ERC-4337. There's a single, official version of this contract used across all EVM chains. If any changes are needed, a new contract has to be deployed. The current EntryPoint contract can be found at the address 0x5FF137D4b0FDCD49DcA30c7CF57E578a026d2789. While the EntryPoint contract remains consistent, other contracts like smart accounts, Paymasters, and Factories offer flexibility for customization. Developers can tailor these contracts to their specific requirements, provided they follow the guidelines of ERC-4337.

In the next part of the post, we'll explore the lifecycle of a UserOp under ERC-4337, from the initial setup of a new smart account to the detailed process of verifying and executing a UserOp. This walkthrough will cover multiple scenarios including one where gas costs are sponsored by a paymaster. We'll dive into the functions and interactions of key components in this system: the EntryPoint, Smart Account, Paymaster, and Factory.

I recommend pulling up the Ethereum Foundation's repository related to ERC-4337 as you read along. It's available here and features sample contracts like SimpleAccount.sol, EntryPoint.sol, DepositPaymaster.sol, TokenPaymaster.sol, and SimpleAccountFactory.sol. Reviewing these contracts while I walk through the examples will really help with your understanding of the ERC-4337 framework's mechanisms and interactions.

The examples we're going to cover use sample contracts provided by the Ethereum Foundation and Biconomy to show specific instances of how ERC-4337 can be implemented. However, it's important to understand that both smart accounts and paymasters in ERC-4337 offer a high degree of customization. They are not limited to predefined structures or functionalities but can be tailored to meet diverse verification and execution requirements.

In these simple examples, the smart account will have just one owner, whose signing key is an EOA verified using ECDSA with the secp256k1 curve. There will also be no special modules (I'll explain what modules are later); the account will solely execute the intended action in the UserOp and nothing more.

Deploying a Smart Account

In order to send a transaction from a smart account in ERC-4337, you first need to have a smart account deployed. Your smart account wallet software will handle the setup details, potentially offering options such as setting account owners and recovery agents. In our example, the smart account will only have one owner. To start the deployment process you'd interact with the wallet interface, where a UserOp is crafted, signed, and sent to a Bundler.

Before accepting a UserOp, bundlers verify its authenticity and viability. They do this offchain by simulating a call to the EntryPoint contract's simulateValidation function through their node's RPC interface. This validation ensures that the UserOp's signature is correct and that it can cover its fees. Invalid UserOps are dropped from the Bundler's mempool. Bundlers then aggregate multiple valid UserOps from their mempool and send them as a batch to the EntryPoint’s handleOps function, using their EOA. Since they are using their EOA, they have to pay the gas for this execution. They are later reimbursed for this gas cost, plus an additional fee, but it's important to note that they initially front the gas.

(Diagram: follow the process steps in numerical order.)

The initCode field of the UserOp encodes the deployment information for a new smart account. The first 20 bytes of initCode are the address of the Factory Contract that the UserOp wants to use to deploy the smart account. If this field is not empty, the EntryPoint calls this Factory Contract to begin the smart account deployment process. The Factory Contract's role is to deploy new smart accounts, using the CREATE2 opcode for predictable address generation. It stores the bytecode of the smart account as a state variable to assist in deployment. Using CREATE2, the Factory Contract combines this bytecode, the owner's signer address (in our case an EOA), and an optional salt (which is the nonce in the UserOp) to deploy a new smart account. To ensure consistency, the wallet interface precomputes the new account's address and includes it in the UserOp's sender field. If the actual and precomputed addresses don't match, the UserOp is reverted. If the deployment is successful, the EntryPoint emits an AccountDeployed event.
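The way initCode is split can be shown with a short Python sketch. The function name and the example bytes are hypothetical; only the layout (20 bytes of factory address, followed by the data passed to the factory) comes from the description above.

```python
from typing import Optional, Tuple

def split_init_code(init_code: bytes) -> Optional[Tuple[str, bytes]]:
    """Split initCode into (factory address, data passed to the factory).

    Returns None when initCode is empty, meaning the account already exists.
    """
    if len(init_code) == 0:
        return None
    if len(init_code) < 20:
        raise ValueError("initCode must start with a 20-byte factory address")
    factory = "0x" + init_code[:20].hex()
    factory_data = init_code[20:]
    return factory, factory_data

# Hypothetical factory address (0xaa repeated) followed by placeholder factory data.
example_init_code = bytes.fromhex("aa" * 20) + b"\x01\x02\x03"
print(split_init_code(example_init_code))
print(split_init_code(b""))
```

An empty initCode is the common case: once the account has been deployed, later UserOps leave the field empty and the EntryPoint skips the factory call.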

Example UserOp for Smart Account Deployment:

{
  "sender": computedAddress, // Address of the new smart account, computed offline
  "nonce": 0, // For a new account, the nonce is an optional salt
  "callData": "", // Empty for deployment
  "callGasLimit": 1000000, // Estimated gas limit for deployment
  "verificationGasLimit": 50000, // Gas limit for verification process
  "preVerificationGas": 21000, // Basic transaction gas cost
  "maxFeePerGas": 1000000000000, // Maximum fee per gas, in wei
  "maxPriorityFeePerGas": 1e9, // 1 Gwei, example priority fee
  "paymasterAndData": 0x0…, // Zero address if not using a paymaster
  "signature": "0xEOAOwnerSignature", // Signature from the account owner (offchain signing process)
  "initCode": deploymentCode // Factory address and constructor arguments for the new account
}

Bundlers use their EOAs to send UserOps to the EntryPoint, so they need to be compensated for the gas used in that transaction and paid an additional fee for their services. This is paid by either the smart account or a sponsoring paymaster. Smart accounts or paymasters deposit ETH in the EntryPoint contract to cover these costs. If the deposit is insufficient during execution, the EntryPoint withdraws the required amount from the respective contract (smart account or paymaster) and then compensates the Bundler. If at any stage the contract lacks sufficient ETH to cover the fee, the UserOp is reverted.

The fee payment process for deploying a smart account works as follows:

  • Direct ETH Transfer: Thanks to the CREATE2 opcode's deterministic nature, users can send ETH directly to the precomputed address before the actual smart account is deployed. Upon deployment, the smart account will already possess ETH, enabling the EntryPoint to withdraw the required amount from the smart account.

  • Paymaster Sponsorship: The gas costs are funded by a paymaster. In this case, the paymaster's address and relevant information are specified in the paymasterAndData field. The paymaster contract has specific rules set up to determine whether it will sponsor gas for a particular smart account.

In our example, since there isn't a sponsoring paymaster involved, it would be necessary to precompute the address of the smart account beforehand and prefund it with some ETH. This prefunding covers the cost of the bundler's services. By doing so, when the smart account is deployed using CREATE2, it already has the necessary ETH balance. This allows the EntryPoint to directly use these funds to compensate the bundler for processing the transaction. From the user's perspective, the smart account wallet software handles this technical process, often providing a straightforward option like a button to easily prefund their smart account.
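A rough way to estimate how much ETH the precomputed address needs is to multiply the UserOp's total gas limits by its maxFeePerGas. The sketch below is a simplified upper bound under that assumption; it ignores any paymaster-specific adjustments, and the function names are my own rather than the EntryPoint's.

```python
def required_prefund(call_gas_limit: int, verification_gas_limit: int,
                     pre_verification_gas: int, max_fee_per_gas: int) -> int:
    """Worst-case cost in wei: every gas limit fully used at the maximum fee."""
    total_gas = call_gas_limit + verification_gas_limit + pre_verification_gas
    return total_gas * max_fee_per_gas

def missing_funds(prefund: int, current_deposit: int) -> int:
    """How much more ETH (in wei) must be available to cover the prefund."""
    return max(0, prefund - current_deposit)

# Values taken from the example deployment UserOp above.
prefund = required_prefund(
    call_gas_limit=1_000_000,
    verification_gas_limit=50_000,
    pre_verification_gas=21_000,
    max_fee_per_gas=1_000_000_000_000,
)
print(prefund / 10**18)            # in ETH
print(missing_funds(prefund, 0))   # nothing deposited yet
```

With the example's numbers this comes to 1.071 ETH, which mostly reflects the very high illustrative maxFeePerGas of 1,000 gwei; with a more typical fee cap the required prefund would be far smaller, and any unused portion is not spent.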

Summary of Key Steps:

  1. Prefunding the Smart Account: Before any actions, the smart account's address must be precomputed and prefunded with ETH to cover the costs of the bundler's services.

  2. Forming UserOps: UserOps are formed in the wallet interface to represent the intended actions for deploying the smart account, signed, and sent to Bundlers. 

  3. Validation and Processing by Bundlers: Bundlers validate the UserOps offchain for authenticity and viability by simulating them against the EntryPoint, and aggregate the valid UserOps.

  4. Sending UserOps to EntryPoint: The aggregated UserOps are sent to the EntryPoint contract’s handleOps function by the Bundlers using their EOAs. 

  5. Deploying through Factory Contract: The initCode in the UserOp directs the EntryPoint to use the Factory Contract, which uses the CREATE2 opcode, to deploy the new smart account.

  6. Precomputed Address Consistency Check: The EntryPoint checks the precomputed address in the UserOp against the actual deployed smart account address. A mismatch means the UserOp is reverted. 

  7. Deployment Confirmation Event: The EntryPoint emits an AccountDeployed event upon successful deployment of the smart account. 

  8. Gas Cost Management: Bundlers are reimbursed for their gas costs in processing and sending UserOps to the EntryPoint. This reimbursement is either from the smart account's funds or a paymaster's sponsorship. 

Verifying and Executing a UserOp From Smart Account

With a smart account deployed, let's explore the process of verifying and executing a UserOp from it. As an example, consider a scenario where you want to approve UniSwap to spend 0.01 WETH and then swap that 0.01 WETH for 20 DAI. From a user perspective, if you have 0.01 WETH in your smart account and sufficient ETH deposited in the EntryPoint to cover the gas fees, the process is straightforward. You simply click a button, sign the transaction with the EOA designated as the owner of your smart account in your wallet interface, and then wait for the UserOp to be processed. With just one click, you've both approved and swapped. You'll see your WETH gone and a new DAI balance in your smart account. There's a lot happening behind the scenes, so let's break it down.

Initially, the UserOp is formed by the application's frontend and the user's account abstraction wallet software. This object includes necessary details such as the smart account's address, nonce, calldata, gas limits, and the signature from the smart account's owner.

The UserOp might look like: 

{ 
  "sender": "0xSmartAccountAddress", 
  "nonce": 1, // Current nonce of the smart account 
  "callData": "0xEncodedToCallExecutebatchWithEncodedWETHAddressValueCallDataToCallApproveAndEncodedUniSwapAddressValueCallDataToCallSwap", // Encoded calldata to call the executeBatch function that will execute approve and swap operations 
  "callGasLimit": 500000,
  "verificationGasLimit": 50000,
  "preVerificationGas": 21000,
  "maxFeePerGas": 100000000000,
  "maxPriorityFeePerGas": 2000000000,
  "paymasterAndData": "0x0…", // No paymaster used (zero address) 
  "signature": "0xEOAOwnerSignature" // Signature from the EOA owner of the smart account 
}; 

From a high-level perspective, the callData is a bytes array that the smart account is set up to parse and decode. In our example it contains three items. The first item is to call the executeBatch function on the smart account. The next is to call the ERC20 contract, to approve UniSwap to spend 0.01 WETH. The last item is to execute a swap transaction on UniSwap, where 0.01 WETH is exchanged for DAI. The smart account, upon receiving this callData, first processes the approval transaction with the ERC20 token and then carries out the swap operation on UniSwap, ensuring that both actions are completed in the intended order within a single UserOp. We'll go through how this works in more detail later in this section.

Going back to our example, as before, the bundler uses an RPC method to call the simulateValidation function of the EntryPoint locally before sending the UserOp to the EntryPoint. This step again checks whether the signature on the UserOp is authentic and whether the operation has the necessary funds to be executed. Once the UserOp passes this initial validation, the bundler groups it with other UserOps that have also passed and sends the bundle to the handleOps function on the EntryPoint.

Verifying and executing a UserOp begins with looping through the bundle of UserOps, ensuring that each UserOp has enough funds to cover its operation and passes its specific validation checks.

Initially, the system calculates the necessary prefund amount (an estimate of the total amount the user needs to pay the bundler) for the UserOp. An internal function, _validateAccountPrepayment, then assesses whether the smart account associated with the UserOp has sufficient ETH deposited in the EntryPoint to cover the prefund amount. If the deposited ETH meets the required amount, the missingAccountFunds variable is set to zero. Otherwise, the function computes the shortfall and assigns this value to missingAccountFunds.
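As a rough worked example of this arithmetic, the sketch below follows the prefund formula as I understand it from the v0.6 reference EntryPoint: the gas limits are summed (with the verification limit counted three times when a paymaster is present) and multiplied by maxFeePerGas. Treat that formula as an assumption. The function names and deposit figures are illustrative; only the gas values come from the example UserOp above.

```python
# Sketch of the prefund check. The formula mirrors the v0.6 reference
# EntryPoint as an assumption; all names here are illustrative.

def required_prefund(call_gas_limit, verification_gas_limit,
                     pre_verification_gas, max_fee_per_gas,
                     has_paymaster=False):
    # With a paymaster, the verification limit is counted 3x because the
    # same limit also bounds the paymaster's own calls.
    mul = 3 if has_paymaster else 1
    required_gas = (call_gas_limit
                    + verification_gas_limit * mul
                    + pre_verification_gas)
    return required_gas * max_fee_per_gas


def missing_account_funds(prefund, deposit):
    # Zero when the deposit already covers the prefund, else the shortfall.
    return 0 if deposit >= prefund else prefund - deposit


# Gas values from the example UserOp (maxFeePerGas is in wei).
prefund = required_prefund(500_000, 50_000, 21_000, 100_000_000_000)
print(prefund)                                      # 57100000000000000 wei (0.0571 ETH)
print(missing_account_funds(prefund, 10**17))       # 0
print(missing_account_funds(prefund, 5 * 10**16))   # 7100000000000000
```

With a 0.1 ETH deposit nothing is missing, while a 0.05 ETH deposit leaves a 0.0071 ETH shortfall that the smart account would be asked to top up.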

This step is followed by a call to the validateUserOp function on the smart account. This function verifies that the call is made by the EntryPoint and validates the signature using the _validateSignature method. In our case, _validateSignature checks whether the signature (userOp.signature) originates from the EOA that owns the account. If the signature doesn't match, it returns SIG_VALIDATION_FAILED instead of reverting.

The process also includes nonce verification to ensure transaction order and prevent replay attacks. If additional funds are needed to meet the prefund requirement, the function attempts to transfer the necessary ETH from the smart account to the EntryPoint. This step does not cause a revert in the case of insufficient funds in the smart account, as this is addressed subsequently in the EntryPoint.

After these validation checks, the EntryPoint rechecks if the smart account's deposit meets the prefund amount. If the deposit still falls short, despite the attempted top-up, the UserOp is reverted. This process ensures each UserOp is thoroughly vetted for both funding and authenticity before execution.

In the EntryPoint contract's handleOps function, each UserOp is individually processed. In typical smart contract scenarios, a revert would cause the entire transaction to fail and halt all subsequent actions. However, the handleOps function is designed differently for handling UserOps. Here, if a UserOp's call reverts during the execution phase, the contract captures this failure and proceeds to the next operation. This is made possible through Solidity's try and catch blocks, allowing the system to isolate and manage individual failures without impacting other operations. A failure during the onchain validation phase is handled differently: it reverts the whole handleOps call, which is why bundlers simulate validation before submitting a bundle.
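To make the try and catch behavior concrete, here is a toy Python model of the execution loop. It is only a sketch: run_bundle and the dictionary-based UserOps are hypothetical stand-ins, and the real isolation happens inside the EntryPoint contract.

```python
# Toy model of the execution loop: each operation runs inside its own
# try/except so one failure does not stop the rest of the bundle.
# The UserOp dicts and the run_bundle helper are hypothetical.

def run_bundle(user_ops):
    results = []
    for op in user_ops:
        try:
            op["action"]()                  # stands in for the call into the smart account
            results.append((op["sender"], "success"))
        except Exception as err:            # analogous to Solidity's catch block
            results.append((op["sender"], f"reverted: {err}"))
    return results


def ok():
    return None


def failing():
    raise RuntimeError("insufficient WETH balance")


bundle = [
    {"sender": "0xAlice", "action": ok},
    {"sender": "0xBob", "action": failing},
    {"sender": "0xCarol", "action": ok},
]
for sender, status in run_bundle(bundle):
    print(sender, status)
```

The second operation fails, yet the third still runs, which is the property the execution loop is designed to provide.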

Once the UserOps have passed the validation phase, they move into the execution phase. This phase involves looping through the array of verified UserOps and calling the EntryPoint’s _executeUserOp function for each one. A critical function within this phase is innerHandleOp, which executes a low-level call to the Smart Account, using the callData provided in the UserOp. The callData dictates which functions to execute and their respective inputs. In our specific scenario, it is encoded to call our executeBatch function with the following inputs:

  • dest: An array of addresses (the WETH and UniSwap addresses)

  • value: An empty array

  • func: An array of encoded calls: first the approve function, permitting UniSwap to spend 0.01 WETH, then the swap function, exchanging 0.01 WETH for DAI.
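To show what one entry of the func array looks like at the byte level, the sketch below encodes the approve call by hand. The selector 0x095ea7b3 is the standard one for approve(address,uint256); the spender address is a made-up placeholder, and encode_approve is a hypothetical helper rather than part of any wallet SDK.

```python
# Sketch of how the first `func` entry (the approve call) is laid out.
# The spender address below is a placeholder, not a real router address.

APPROVE_SELECTOR = bytes.fromhex("095ea7b3")   # approve(address,uint256)


def encode_approve(spender_hex: str, amount: int) -> bytes:
    raw = spender_hex[2:] if spender_hex.startswith("0x") else spender_hex
    spender = bytes.fromhex(raw)
    if len(spender) != 20:
        raise ValueError("address must be 20 bytes")
    # Each static argument is left-padded to a 32-byte word.
    return (APPROVE_SELECTOR
            + spender.rjust(32, b"\x00")
            + amount.to_bytes(32, "big"))


spender = "0x" + "11" * 20          # hypothetical spender
amount = 10**16                     # 0.01 WETH (WETH has 18 decimals)
func0 = encode_approve(spender, amount)
print(len(func0))                   # 68 bytes: 4 + 32 + 32
print("0x" + func0.hex())
```

The swap entry would be built the same way from the swap function's selector and arguments, and both entries are then wrapped in the outer executeBatch call.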

The executeBatch function starts by ensuring that the call originates from the EntryPoint or the account owner. It then verifies the alignment of the lengths of the destination addresses (dest), values (value), and function calls and inputs (func). Then the function makes calls to the specified smart contracts. This function appears as follows:

function executeBatch(address[] calldata dest, uint256[] calldata value, bytes[] calldata func) external {
    _requireFromEntryPointOrOwner();
    require(dest.length == func.length && (value.length == 0 || value.length == func.length), "wrong array lengths");
    if (value.length == 0) {
        for (uint256 i = 0; i < dest.length; i++) {
            _call(dest[i], 0, func[i]);
        }
    } else {
        for (uint256 i = 0; i < dest.length; i++) {
            _call(dest[i], value[i], func[i]);
        }
    }
}

In our example, the smart account executes two specific transactions:

  1. Approving UniSwap to spend 0.01 WETH on the WETH contract

  2. Swapping 0.01 WETH for 20 DAI on the UniSwap contract

As the EntryPoint contract processes each of these UserOps, it keeps track of the gas consumed by each operation. This tracking covers all UserOps, whether they execute successfully or end up reverting. The EntryPoint contract also keeps an up-to-date record of the deposit amounts for each involved smart account and any paymaster contracts. Once all UserOps in the batch are processed, the EntryPoint contract concludes the operation with the _compensate function. This function takes the total fees collected for the gas used across the entire batch of UserOps and transfers this amount from the EntryPoint to the bundler.
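The settlement for a single UserOp can be sketched as follows. The effective gas price rule (the lower of maxFeePerGas and maxPriorityFeePerGas plus the base fee) is my understanding of the reference implementation and should be read as an assumption; the base fee and gas-used figures are invented inputs, and settle is a hypothetical helper.

```python
# Sketch of how the charge for one UserOp could be settled against its
# prefund. The base fee and gas-used figures are made-up inputs.

def effective_gas_price(max_fee_per_gas, max_priority_fee_per_gas, base_fee):
    # EIP-1559 style pricing: never pay more than maxFeePerGas.
    return min(max_fee_per_gas, max_priority_fee_per_gas + base_fee)


def settle(prefund, gas_used, gas_price):
    actual_cost = gas_used * gas_price
    if actual_cost > prefund:
        raise ValueError("prefund below actual gas cost")
    refund = prefund - actual_cost      # remains in the account's deposit
    return actual_cost, refund


GWEI = 10**9
price = effective_gas_price(100 * GWEI, 2 * GWEI, base_fee=30 * GWEI)
cost, refund = settle(prefund=57_100_000_000_000_000,
                      gas_used=200_000, gas_price=price)
print(price)    # 32000000000
print(cost)     # 6400000000000000  (collected towards the bundler)
print(refund)   # 50700000000000000 (stays in the account's deposit)
```

The prefund is therefore an upper bound: the account is only charged for the gas actually used, and the remainder stays available for future UserOps.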

Summary of Key Steps:

  1. UserOp Formation: The application's frontend and the user's account abstraction wallet create the UserOp, including essential details like the smart account's address, nonce, calldata, gas limits, and owner's signature. 

  2. Validation by Bundler: By calling the EntryPoint's simulateValidation function locally through an RPC method, the bundler authenticates the signature and ensures the UserOp is funded.

  3. Bundler Sends to EntryPoint: After initial validation, the bundler sends the UserOp batch to the EntryPoint's handleOps function. 

  4. Prefund Assessment: The EntryPoint calculates the necessary prefund for the UserOp and checks if the smart account has sufficient deposited ETH. 

  5. Signature and Nonce Verification: The EntryPoint calls validateUserOp on the smart account, which validates the signature. The nonce is also checked to enforce ordering and prevent replay attacks.

  6. Funding Shortfall Check: If additional funds are needed, validateUserOp attempts to transfer the required ETH from the smart account to the EntryPoint, which then rechecks the deposit.

  7. UserOp Processing: Each UserOp is individually processed in the EntryPoint, isolating failures and allowing continuous operation through try and catch blocks. 

  8. Execution Phase: The EntryPoint loops through the verified UserOps and calls the smart account with each UserOp's callData, which in our example invokes the executeBatch function.

  9. Specific Transactions Execution: The smart account executes the intended transactions, like approving and swapping. 

  10. Bundler Refunded: Bundlers are reimbursed for their gas costs in processing and sending UserOps to the EntryPoint, paid from the smart account's deposit in the EntryPoint.

Using a Paymaster To Pay For Gas

In this scenario, we explore how a paymaster can sponsor the gas fees required for minting an NFT. This allows users to mint NFTs without paying any gas fees. The process from the user's standpoint is straightforward: users click a mint button, and the transaction is processed with the gas fees covered by the paymaster.

The journey begins with the user's account abstraction wallet and the application's frontend collaborating to create a UserOp. This UserOp contains essential information such as the smart account's address, the current nonce, and the encoded instructions for the NFT minting operation. Additionally, it includes a field, paymasterAndData, which carries details about the paymaster responsible for the gas fees.

An example UserOp for NFT minting might look like this:

{
  "sender": "0xSmartAccountAddress",
  "nonce": 2,
  "callData": "0xEncodedToCallExecuteWithEncodedMintNFTContractAddressAndMintFunction", // Encoded calldata to call execute with the NFT address and minting operation
  "callGasLimit": 500000,
  "verificationGasLimit": 50000,
  "preVerificationGas": 21000,
  "maxFeePerGas": 100000000000,
  "maxPriorityFeePerGas": 2000000000,
  "paymasterAndData": "0xPaymasterAddressEncodedValidUntilValidAfterSignature", // Paymaster info, including address, timestamps, and signature
  "signature": "0xEOAOwnerSignature"
};

The paymasterAndData contains the paymaster's address and additional data like valid timestamps and a signature. This signature, generated offchain by a service trusted by the paymaster, confirms the agreement to sponsor this operation. The valid timestamps in paymasterAndData indicate when the operation is permitted.

The UserOp is sent to a bundler, which verifies the operation's authenticity and the paymaster's commitment to cover the fees. After passing validation, the UserOp is bundled with others and sent to the EntryPoint's handleOps function.

Processing begins with the _validateAccountPrepayment function, which determines whether the required prepayment is covered by the user's ETH deposit in the EntryPoint or by a paymaster. Since a paymaster is used in this case, no additional transfer of ETH is needed from the user's account. The procedure then advances to the signature and nonce verification phase. This involves the validateUserOp function on the smart account, which verifies that the call comes from the EntryPoint, confirms that the signature matches the smart account's owner, checks the nonce, and assesses whether any extra ETH transfer is necessary. In this scenario, the presence of the paymaster eliminates the need for an additional transfer.

Next, the handleOps function extracts the paymaster's address from the first 20 bytes of the paymasterAndData field. If the paymaster is not set to the zero address, the internal _validatePaymasterPrepayment function is called. This function checks whether the paymaster's ETH deposit is sufficient for the prepayment amount. If the paymaster's deposit is insufficient, the UserOp is immediately reverted.
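The field can be split apart as in the following sketch. That the first 20 bytes hold the paymaster address comes from the description above; the rest of the layout (two 32-byte words for validUntil and validAfter, followed by the signature) is an assumption modeled on a verifying-paymaster style encoding, and both helper functions are hypothetical.

```python
# Sketch of splitting paymasterAndData. Only the 20-byte address prefix is
# taken from the text; the remaining layout is an assumed encoding.
import time


def parse_paymaster_and_data(data: bytes):
    if len(data) < 20:
        return None                                    # no paymaster in use
    paymaster = "0x" + data[:20].hex()
    valid_until = int.from_bytes(data[20:52], "big")
    valid_after = int.from_bytes(data[52:84], "big")
    signature = data[84:]
    return paymaster, valid_until, valid_after, signature


def within_window(valid_until, valid_after, now=None):
    now = int(time.time()) if now is None else now
    # A validUntil of 0 is treated here as "no expiry".
    return now >= valid_after and (valid_until == 0 or now <= valid_until)


raw = (bytes.fromhex("22" * 20)                 # hypothetical paymaster address
       + (1_700_003_600).to_bytes(32, "big")    # validUntil
       + (1_700_000_000).to_bytes(32, "big")    # validAfter
       + b"\x00" * 65)                          # placeholder signature
paymaster, until, after, sig = parse_paymaster_and_data(raw)
print(paymaster, until, after, len(sig))
print(within_window(until, after, now=1_700_001_000))   # True
print(within_window(until, after, now=1_700_009_999))   # False
```

Outside the validity window the sponsorship no longer applies, which lets the offchain signing service issue short-lived approvals.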

After that, the EntryPoint calls the validatePaymasterUserOp function on the paymaster. This function's primary role is to validate the eligibility of a UserOp for gas fee sponsorship by the paymaster. It begins by parsing the paymasterAndData field of the UserOp to extract the validUntil and validAfter timestamps and a signature. A unique hash is created from the contents of the UserOp, including these timestamps. This hash is then used in Ethereum's standard message signing process, and ECDSA recovery is used to check that the provided signature was produced by the paymaster's owner. If the recovered signer does not match, indicating a lack of authorization by the paymaster, the function returns SIG_VALIDATION_FAILED and the operation cannot proceed under this paymaster's sponsorship. Along with the paymaster's details, the function returns a context, which is empty in this scenario and plays no role here but will in the next scenario.

Assuming the UserOp passes the paymaster's validation, the handleOps function proceeds to the execution stage. It follows similar steps to the previous example by calling into the smart account, which then interacts with the NFT contract to mint the NFT. The difference in this scenario is that the callData is now encoded to call the smart account's execute function rather than executeBatch, as there's only one operation to execute: the minting of the NFT.

Finally, the handleOps function concludes by compensating the bundler for the gas used in the process.

Summary of Key Steps:

  1. UserOp Formation: The account abstraction wallet and application frontend collaborate to create a UserOp, containing the smart account's address, nonce, encoded NFT minting instructions, paymasterAndData details, etc. 

  2. UserOp Verification by Bundler: The bundler locally verifies the authenticity and viability of the UserOp. 

  3. Bundler Submission to EntryPoint: After local validation, the bundler sends the UserOp to the EntryPoint contract's handleOps function. 

  4. Signature and Nonce Verification: The validateUserOp function on the smart account verifies that the signature matches the smart account's owner, and the nonce is unique.

  5. Paymaster Verification: The EntryPoint extracts the paymaster's address from paymasterAndData and validates the paymaster's deposit is sufficient through the _validatePaymasterPrepayment function.

  6. Paymaster UserOp Authorization: The EntryPoint calls the validatePaymasterUserOp function on the paymaster contract to confirm sponsorship eligibility for the UserOp.

  7. Execution of NFT Minting: Assuming successful validation, the EntryPoint contract calls the smart account's execute function, which interacts with the NFT contract to mint the NFT. 

  8. Compensation of Bundler: After successful execution, the EntryPoint concludes by compensating the bundler for the gas used in processing the transaction.

Paymaster Allowing Users To Pay With ERC-20s

In contrast to the previous examples where transactions were funded using ETH or covered for free by the paymaster, this scenario explores a paymaster setup that allows users to pay for gas with ERC-20 tokens.

Here's a high-level overview of how this differs from previous methods: 

  1. ERC-20 Token Charge for Gas: Users pay for gas in ERC-20 tokens, calculated based on the ETH equivalent of the gas cost plus any markup by the paymaster.

  2. ETH Deposit Maintenance by Paymaster: The paymaster maintains an ETH deposit in the EntryPoint contract to cover actual gas costs, despite accepting ERC-20 tokens from users.

  3. Batched Transaction Process (Approve Paymaster): 

    1. Initially, users approve the paymaster to spend their ERC-20 tokens. This approval is necessary for the paymaster to subsequently transfer the tokens from the user's account. 

    2. Following the approval, the next transactions in the batch execute the user's intended onchain activities.

  4. Token Amount Calculation, Balance Check, and Context Returned in _validatePaymasterUserOp

    1. Following signature verification, this function determines the required ERC-20 token amount to cover gas fees, considering the necessary ETH prepayment and additional charges.

    2. It checks if the user's account has enough of the ERC-20 token (fee token) to cover the calculated charge.

    3. It returns a context with essential transaction details, including the required token amount.

  5. Post-Operation (postOp) Processing on Paymaster including Fee Transfer

    1. The function begins by decoding the context. It extracts key data such as the user's account, the fee token, and pricing details. 

    2. Next, it calculates the effective exchange rate for converting the gas costs to fee tokens, which is done using either the data provided in the context or by fetching new data from an oracle aggregator. 

    3. Following this, the function calculates the charge amount. This involves converting the actual gas cost into the equivalent amount in fee tokens and then applying the price markup. 

    4. Finally, the function transfers the calculated charge from the user's account to the designated fee receiver.

It's important to note that the approach described above reflects Biconomy's specific implementation of a token paymaster. However, paymaster contracts can be tailored to a developer's requirements as long as they comply with ERC-4337 standards, offering various levels of customization in terms of how fees are calculated, processed, and collected.
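The charge calculation in the postOp step can be illustrated with a small sketch. The scaling of the exchange rate and the markup denominator below are illustrative assumptions, not the constants of Biconomy's or any other paymaster, and token_charge is a hypothetical helper.

```python
# Sketch of a postOp-style charge calculation in fee tokens.
# PRICE_DENOMINATOR and the rate format are assumptions for illustration.

PRICE_DENOMINATOR = 1_000_000      # a markup of 1_100_000 means +10%


def token_charge(actual_gas_cost_wei, tokens_per_eth, token_decimals, markup):
    # Convert the wei cost into token base units, then apply the markup.
    # tokens_per_eth is a whole-token price, e.g. 2000 for 2000 USDC per ETH.
    base = actual_gas_cost_wei * tokens_per_eth * 10**token_decimals // 10**18
    return base * markup // PRICE_DENOMINATOR


# 0.0064 ETH of gas, 2000 USDC per ETH, 6 token decimals, 10% markup.
charge = token_charge(6_400_000_000_000_000, 2000, 6, 1_100_000)
print(charge)          # 14080000 base units, i.e. 14.08 USDC
```

Integer arithmetic is used throughout because that is how such a calculation would have to be carried out onchain, where floating point is not available.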

Customization and Flexibility in ERC-4337

Smart Accounts

In ERC-4337, each smart account is required to include the validateUserOp function. Developers have the flexibility to implement various types of verification logic within this function, depending on their specific requirements. The validateUserOp function should either confirm successful verification by returning 0 or indicate a failure in the verification process by returning a specific value, such as SIG_VALIDATION_FAILED, to the EntryPoint.
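The return value can carry more than a plain success or failure flag. The sketch below follows the packed layout as I understand it from the v0.6 reference contracts: the low 160 bits hold 0 for success, 1 for a signature failure, or an aggregator address, followed by 48-bit validUntil and validAfter timestamps. Treat the layout as an assumption; the helper names are my own.

```python
# Sketch of the packed value returned by validateUserOp. The bit layout is
# an assumption based on the v0.6 reference contracts.

SIG_VALIDATION_FAILED = 1


def pack_validation_data(sig_failed: bool, valid_until: int = 0,
                         valid_after: int = 0) -> int:
    return (int(sig_failed)
            | (valid_until << 160)
            | (valid_after << (160 + 48)))


def unpack_validation_data(data: int):
    authorizer = data & ((1 << 160) - 1)
    valid_until = (data >> 160) & ((1 << 48) - 1)
    valid_after = data >> (160 + 48)
    return authorizer, valid_until, valid_after


print(pack_validation_data(False))      # 0
print(pack_validation_data(True))       # 1
packed = pack_validation_data(False, valid_until=1_700_003_600,
                              valid_after=1_700_000_000)
print(unpack_validation_data(packed))   # (0, 1700003600, 1700000000)
```

Returning a value instead of reverting on a bad signature is what allows bundlers to simulate a UserOp and estimate gas even before a valid signature is attached.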

Following the verification process, if a UserOp is successfully validated, the smart account is then responsible for executing the actions specified in the UserOp's callData. Additionally, developers can program the smart account to perform other actions during the execution phase.

Advanced Functionality in Smart Accounts

Smart accounts in ERC-4337 can incorporate a variety of advanced functionalities that extend beyond basic transaction validation and execution. These advanced functionalities, referred to as modules, can be easily integrated into smart accounts. I will cover these modules in detail in a later section, where I discuss an ERC that proposes a standardization for them.

One such module is the use of session keys, which allow users to temporarily delegate transaction signing authority for specific activities, like granting a game permission to sign on their behalf for a set period. Another intriguing module is the implementation of subscription models. Users can program their smart accounts to handle recurring payments, such as a monthly subscription fee to a service like Netflix, where Netflix can pull a fixed amount of tokens per month and no more. For financial management and security, smart accounts can also enforce spending limits. Users can set a cap on the amount of money or value of assets that can be transacted monthly, providing control over their financial activities. Asset protection is another vital module. Users can set rules in their smart accounts to safeguard valuable assets. Any transaction attempting to transfer these protected assets, like a Pudgy Penguin NFT, would be automatically reverted. Additionally, role-based access control in smart accounts allows for the creation of specific roles with defined permissions for specific signing key holders. For example, a role can be established for trading on a particular UniSwap pair with restrictions on the percentage of profits the key holder can access.
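As a rough illustration of how a session key combines several of these rules, here is a toy Python model. The SessionKey class, its fields, and the placeholder addresses are all hypothetical; a real module would enforce the same checks in Solidity during validation.

```python
# Toy model of a session-key module: a delegated key may only call
# allow-listed targets, before an expiry, and within a spending cap.
# All names and the policy shape are hypothetical.
from dataclasses import dataclass


@dataclass
class SessionKey:
    allowed_targets: frozenset
    expires_at: int
    spend_cap: int
    spent: int = 0

    def authorize(self, target: str, value: int, now: int) -> bool:
        if now > self.expires_at:
            return False                  # session has expired
        if target not in self.allowed_targets:
            return False                  # e.g. an NFT contract not on the list
        if self.spent + value > self.spend_cap:
            return False                  # would exceed the cap
        self.spent += value
        return True


key = SessionKey(frozenset({"0xGame"}), expires_at=1_000, spend_cap=100)
print(key.authorize("0xGame", 60, now=500))     # True
print(key.authorize("0xGame", 60, now=500))     # False: cap would be exceeded
print(key.authorize("0xNFT", 1, now=500))       # False: target not allowed
print(key.authorize("0xGame", 10, now=2_000))   # False: expired
```

The same pattern covers spending limits and asset protection: each is another condition checked before a call is allowed through.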

You might be wondering, can't these features already be implemented with EOA meta-transactions or pre-approved batch signatures? The answer is yes, but there are distinct advantages to using smart account modules. These modules enable onchain verification of conditions and allow for the simultaneous use of multiple relayers. In contrast, other methods might require giving up control of your EOA private key or relying on a second party's countersignature, which introduces potential risks of censorship or denial-of-service issues.

Paymasters

Paymasters in ERC-4337 are required to have a validatePaymasterUserOp function, which decides whether a given UserOp may use the paymaster. Additionally, they may include a postOp function to manage tasks after the operation, such as transferring ERC-20 fee tokens from the user or refilling the EntryPoint contract with ETH.

The design of paymasters allows for customized fee sponsorship. Developers can create paymasters that cover transaction fees in exchange for ERC-20 tokens, offer free transactions under certain conditions, or implement tiered fee structures based on user behavior. This flexibility enables the development of various business models within the Ethereum ecosystem.

For instance, paymasters can offer transaction fee discounts for specific token holders, or create loyalty-based fee coverage programs. They can also adopt dynamic pricing models that respond to market conditions or user engagement, providing innovative ways to manage transaction fees.

Social Recovery and Multi-Signature Smart Accounts

Social recovery in smart accounts introduces a user-friendly approach to account recovery, mitigating the risks associated with traditional methods like relying solely on a seed phrase. Users appoint trusted contacts such as friends, family members, or trusted institutions as recovery agents within their smart account. This setup eliminates the worry of losing access to assets due to lost or forgotten seed phrases.

The recovery process operates on a consensus model. For account recovery to be initiated, a majority or a predefined number of these appointed agents must agree and provide their signatures, effectively authorizing the recovery action. This approach ensures that recovery is not solely dependent on a single individual's decision.
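The consensus rule can be modeled in a few lines. This is a toy sketch with hypothetical names (RecoveryConfig, the guardian labels); an onchain version would verify each agent's signature rather than trust a name.

```python
# Toy model of threshold-based recovery. Guardian names and the
# RecoveryConfig shape are hypothetical.
from dataclasses import dataclass, field


@dataclass
class RecoveryConfig:
    guardians: frozenset
    threshold: int
    approvals: set = field(default_factory=set)

    def approve(self, guardian: str) -> None:
        if guardian not in self.guardians:
            raise PermissionError(f"{guardian} is not a recovery agent")
        self.approvals.add(guardian)      # duplicate approvals count once

    def can_recover(self) -> bool:
        return len(self.approvals) >= self.threshold


cfg = RecoveryConfig(frozenset({"alice", "bob", "carol"}), threshold=2)
cfg.approve("alice")
cfg.approve("alice")
print(cfg.can_recover())    # False: one distinct approval so far
cfg.approve("carol")
print(cfg.can_recover())    # True: two of three agents agree
```

Storing approvals in a set means a single agent cannot reach the threshold by approving repeatedly.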

Smart accounts offer the flexibility to set specific criteria or conditions for recovery activation. These could include time delays, location-based triggers, or other predetermined conditions, adding a layer of control and customization to the recovery process. Smart account wallets should offer users straightforward options to select their recovery agents and define the conditions for account recovery.

Multi-signature smart accounts are similar to Gnosis Safes, but they are now treated as first-class citizens, eliminating the worry of them being a pain to work with. These accounts require multiple signatures from a predefined group of addresses for a UserOp to be executed. This is particularly beneficial for organizations or joint accounts where consensus is needed for transactions, or simply to beef up your security. This collective authorization approach enhances security by distributing control, making unauthorized access more difficult.

Diverse Validation (Signing) Methods

Before the introduction of ERC-4337, proving the authenticity of a transaction required signing it with an EOA's private key using ECDSA on the secp256k1 curve. Now, you have the flexibility to sign your transaction using any method you prefer, enhancing security and user experience and fitting a variety of use cases. These validation methods can even be plugged in as modules to your smart account after it's already deployed. Some examples include:

Passkey-Based Systems with Secure Enclave Integration: Utilizing biometric features like Face ID or fingerprints, this method enables transactions to be signed through biometric authentication. By generating cryptographic signatures within a device's secure enclave, triggered by biometric authentication, this method significantly reduces the exposure of private keys. This ensures that the key material never leaves the secure hardware, providing a robust layer of security against external threats. A key distinction is that passkeys utilize ECDSA on the secp256r1 curve. In your smart account's validateUserOp function, a prewritten function can now be used to verify signatures made with this method. I will cover passkeys in detail in a later section about an EIP that proposes a new precompile aimed at reducing the gas cost for verifying the secp256r1 curve used by passkeys.

MPC (Multi-Party Computation): Enables multiple parties, each holding a portion of private data, to collaboratively compute a function without disclosing their individual inputs. In the context of account abstraction, MPC is used for the secure generation and management of signing keys. The private key in MPC is divided and distributed among various parties, and these shares are combined to create a complete signing key only when authorized. There are various MPC providers, differing in the number of nodes holding a key shard, cryptographic methods for key recreation, and authorization options for key recreation. Some providers allow users to authorize the recreation of their key for transaction signing through methods like email verification with a magic link or by signing in with an OAuth-enabled web2 provider such as Gmail, Facebook, or Twitter.

Quantum-Resistant Algorithms: With advancements in quantum computing, there is a growing need for quantum-resistant cryptographic algorithms. These algorithms are designed to be secure against the potential future threat of quantum computers, providing a more forward-looking approach to transaction signing.

Aggregator Contract

When users transact on a Layer 2, they incur a gas fee. This fee is composed of two parts: the cost to execute the transaction itself and the cost to post transaction data back to Mainnet. The latter represents the bulk of the expenses, accounting for approximately 95% of the total fee charged to users. As Mainnet continues to upgrade, incorporating features like Proto-Danksharding, a solution for more efficient temporary data posting, and Danksharding, which enables nodes to verify complete data submission through probabilistic checks, we can expect these data posting costs to decrease. Nevertheless, the aim is to further reduce Layer 2 costs.

ERC-4337 introduces an approach to alleviate this by minimizing the amount of data Layer 2 networks need to post when using ERC-4337. It leverages an Aggregator Contract, which effectively handles verifying multiple UserOps. Traditionally, each UserOp would require its own signature, often around 65 bytes or more, contributing significantly to the data footprint and, consequently, the cost. However, the Aggregator Contract simplifies this process by removing the need to include a signature field in each UserOp. Instead, the aggregator contract collects a batch of transactions and validates them with a single, combined signature using BLS cryptography. This BLS signature is considerably shorter than the cumulative total of individual UserOp signatures. Since Layer 2 networks pay Mainnet to post this data, the aggregation of signatures into a single, concise BLS signature translates into substantial savings.
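A back-of-the-envelope calculation shows why this matters. The 65-byte figure per UserOp comes from the text above; the size of the aggregated signature depends on the BLS scheme in use, so the 96 bytes below is an assumed value passed in as a parameter, and signature_bytes is a hypothetical helper.

```python
# Back-of-the-envelope data savings from signature aggregation.
# The aggregated signature size is an assumption, not a fixed constant.

def signature_bytes(num_ops, per_op_sig=65, aggregated_sig=None):
    if aggregated_sig is None:
        return num_ops * per_op_sig        # one signature per UserOp
    return aggregated_sig                  # one signature for the whole batch


ops = 100
individual = signature_bytes(ops)
aggregated = signature_bytes(ops, aggregated_sig=96)   # assumed BLS signature size
print(individual)                    # 6500
print(aggregated)                    # 96
print(individual - aggregated)       # 6404 bytes no longer posted to Mainnet
```

The saving grows linearly with the batch size, because the aggregated signature stays the same length no matter how many UserOps it covers.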

To utilize this feature, bundlers have an extra responsibility. They must collect the individual signatures from various user operations and merge them into a single BLS signature. Following this step, they can compile multiple sets of aggregated user operations, with each set bearing one BLS signature. However, instead of using the standard handleOps function on the EntryPoint contract, they must call a different function called handleAggregatedOps designed for this purpose.

The rest of the process largely mirrors the standard ERC-4337 process, with a notable difference during the verification stage. During this phase, instead of contacting each user's smart account one by one for validation, the system consults the aggregator contract specified in the UserOpsPerAggregator struct. The aggregator contract, then, verifies the batch of transactions collectively, which streamlines the operation.

The Future

EIP-7212: Implementing WebAuthn Standards in Ethereum

WebAuthn, developed by the FIDO Alliance and the World Wide Web Consortium (W3C), is a web standard for secure, passwordless authentication. It enables users to log in to websites with biometrics, mobile devices, or FIDO security keys, enhancing security and simplifying the login process. During registration, a user's device generates a public-private key pair, similar to creating an Ethereum EOA, with the public key sent to the website's server and the private key retained securely on the device. For login, the website requests authentication, and the device signs a challenge with the private key, proving identity without a password.

Passkeys are a user authentication method based on the WebAuthn standard. One place to store them when using an Apple device is using iCloud Keychain. When a user signs in via iCloud, it becomes possible to recover their Passkeys. However, while iCloud Keychain is secure, it doesn't offer hardware-level security. The private key is briefly copied in plain text into the system memory, creating a potential attack surface. If an app is compromised, there's a risk to the key. Alternatively, Passkeys can be secured using the iPhone's Secure Enclave, an isolated microchip within Apple's chips designed for secure data and operations. The private key is securely generated and stored in the Secure Enclave, ensuring it never leaves the hardware. This means that using biometrics to access a Passkey stored in the Secure Enclave offers maximum safety, though it trades off ease of recovery.

You might be thinking, "This sounds great, we can use the Secure Enclave to hold our Passkeys, authenticate them with Face ID, and then sign transactions using a passkey, which are verified by an account abstraction wallet." However, it's not that simple. When I mentioned, "During registration, a user's device generates a public-private key pair, similar to creating an Ethereum EOA," I used the word 'similar' intentionally. It's a comparable process but not exactly the same. You might not recall, but earlier in this post, I mentioned that Ethereum, like Bitcoin, uses the ECDSA with the secp256k1 elliptic curve to generate its public-private key pairs. In contrast, the WebAuthn standard employs a different elliptic curve, known as secp256r1, for generating its public-private key pairs.

A problem arises because signature verification on the secp256k1 curve is provided by one of the nine precompiled contracts in the EVM, but this isn't the case for the secp256r1 curve. A precompiled contract is a special type of contract that is integrated directly into the EVM instead of being written and deployed by users. Precompiles perform specific operations at a lower gas cost than the same logic would incur if implemented in a regular smart contract. Among these is the ecrecover function, which takes a message hash and a signature and returns the signer's address. Because it is a precompile, ecrecover is relatively cheap, at about 3k gas. However, ecrecover works only with the secp256k1 curve and not with secp256r1, the curve used by passkeys.

EIP-7212 proposes making the verification of signatures on the secp256r1 curve as efficient as on the secp256k1 curve by incorporating it into a precompiled contract. Initially, the Clave team, who co-authored the EIP, found that verifying secp256r1 signatures in Solidity cost around 400k gas. Optimizations later reduced this to roughly 70k gas. However, each public key requires its own set of pre-computed data, tailored to its mathematical properties, to optimize signature verification. Thus, the pre-computation and deployment are specific to each key. To verify signatures from a different public key, new pre-computation and a new contract deployment are necessary, which is notably gas intensive, costing about 3.2 million gas. For a deeper understanding of the need for pre-computation and redeployment, I recommend listening to the Web3 Galaxy Brain episode featuring the Daimo team, the other half of the EIP-7212 authors, available here. Alternative methods for verifying signatures on the secp256r1 curve exist, including SNARK-based approaches and various Solidity implementations. Nonetheless, the overarching issue remains the same: verifying signatures on the secp256r1 curve within the EVM is highly gas consuming, and EIP-7212 aims to make it as cost effective as verifying signatures on the secp256k1 curve.
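To put those figures side by side, here is a small back-of-the-envelope calculation in Python. It uses only the approximate gas numbers quoted above. The function name and the choice of 100 signatures are my own illustrative assumptions, not anything from the EIP.

```python
# Approximate gas figures quoted in the text (not measured here).
ECRECOVER_GAS = 3_000          # secp256k1 verification via the precompile
P256_SOLIDITY_GAS = 400_000    # initial Solidity secp256r1 verifier
P256_OPTIMIZED_GAS = 70_000    # optimized verifier with per-key pre-computation
P256_DEPLOY_GAS = 3_200_000    # one-time pre-computation + deployment per public key


def average_gas_per_signature(num_signatures: int) -> float:
    """Average cost of the optimized verifier once the per-key deployment is amortized."""
    total = P256_DEPLOY_GAS + P256_OPTIMIZED_GAS * num_signatures
    return total / num_signatures


# How many times costlier than ecrecover each Solidity approach is.
print(P256_SOLIDITY_GAS / ECRECOVER_GAS)
print(P256_OPTIMIZED_GAS / ECRECOVER_GAS)

# Amortized cost over a hypothetical 100 signatures from the same key.
print(average_gas_per_signature(100))  # 102000.0
```

Even after 100 signatures from the same key, the amortized cost in this rough model is still about 34 times that of ecrecover, which is the gap a precompile would close.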

The EIP seems like an obvious choice to include in the protocol, and guest after guest on Web3 Galaxy Brain seems to agree. If we're aiming for a web2-like user experience, making signature verification for biometric keys as affordable as for EOAs is essential. However, Vitalik has expressed caution about adding new precompiled contracts. In his blog post titled “Should Ethereum be okay with enshrining more things in the protocol?”, he pointed out that previous precompiles, like RIPEMD and BLAKE, haven't been used as much as expected, suggesting we should learn from that. Whether this EIP will be included in the protocol remains uncertain, but it's definitely one to watch closely.

ERC-6900: Modular Smart Contract Accounts and Modules

As you now know, if you’ve made it this far, ERC-4337 introduces custom logic in the validation and execution stages of a transaction. This makes room for modules like session keys, subscriptions, spending limits, and role-based access control. Currently, these modules vary across different smart account providers, creating uncertainty about module compatibility and potential duplication of development efforts. For example, imagine this situation: You visit Alchemy’s smart account interface and set up your new smart account. You also install a module from Alchemy that restricts you from spending more than 10 ETH in a single transaction. Later, you decide to try the Obvious Wallet interface to interact with your smart account, similar to how you might export your private key from MetaMask to Rainbow for EOAs. However, when you use the Obvious Wallet interface to access your smart account, you find that the module you installed with Alchemy doesn’t work, and you can spend over 10 ETH in a transaction. This is because Obvious Wallet doesn’t recognize the module from Alchemy. Right now, if you're a developer wanting to create modules, you either get locked into developing for one wallet interface or you have to build separate modules with the same functionality, each compatible with a different wallet interface.

ERC-6900 aims to standardize these modules, simplifying the creation and management of them for developers. This would lead to a more streamlined user experience, as developers could create modules usable across any compliant smart account interface, reducing their workload and enhancing data portability. The standard splits the smart account into more modular components. At its base, the smart account remains simple, but users can add modules to enhance functionality. These modules come in three types:

  • Validation Functions: Check whether the person or entity trying to use the smart account is authorized to do so

  • Execution Functions: The core functions that dictate what the smart account actually does when it's used

  • Hooks: Additional pieces of logic that execute before or after the validation or execution functions

Typically, smart accounts are accessed through the EntryPoint contract, though this isn't the only method. Owners of smart accounts, whether they are EOAs or other smart contracts, have the option to bypass the validateUserOp function. They can choose to directly invoke functions for executing transactions onchain. ERC-6900 is designed to support both these access methods. As a result, it defines more specific categories of validation functions, execution functions, and hooks to cater to these varied approaches. The diagram below illustrates all these specific types of functions. Following the diagram, I will provide a detailed definition of each, along with multiple examples to demonstrate their use and functionality:

  • User Operation Validation Functions: Handle the validation of user operations in smart contract accounts. They are the main validators for ensuring the transaction complies with the contract's rules.

    • Examples: 

      • Passkey Secured Account: A User Operation Validation Function confirms the correct passkey for each transaction.

      • DeFi Credit: A function checks transaction criteria for Buy Now Pay Later services.

  • Runtime Validation Functions: Executed just before the main execution step of a modular account function, particularly for calls that are not made through a UserOp (like direct smart contract interactions).

    • Examples:

      • Check whether the caller trying to modify important settings is indeed the account owner or a designated admin. If the caller is not authorized, the function would prevent the execution of the update.

      • Virtual Cold Storage feature: A Runtime Validation Function could verify high security measures (like 2FA or multi-sig) are met before allowing transactions with certain assets

  • Execution Functions: Define the main execution step of a function for a modular account.

    • Examples:

      • A function that performs a token transfer from the smart contract account to another address.

      • Dollar-Cost Averaging: The Execution Function would periodically execute buy orders for a specified cryptocurrency, following a user defined investment strategy.

  • Pre User Operation Validation Hook Functions: Run before the main user operation validation functions. They are used for setting preconditions or carrying out initial assessments that might impact the validation process.

    • Examples:

      • Daily Transaction Limit: Checks whether the total value of transactions initiated from the account on that day has already reached the preset limit.

      • Timed Unlock: Verifies if the current date surpasses a predefined unlock date

  • Pre Runtime Validation Hook Functions: Executed before the runtime validation functions, setting up preliminary conditions or checks for the validation process.

    • Examples:

      • In automated trading, a function could check current market conditions or a "trading suspension" flag. If active, it prevents the trade execution

      • For an Exploit Detection system, a Pre Runtime Validation Hook Function might check real-time security alerts. If an exploit is detected in the protocol being interacted with, it blocks the transaction to protect the user.

  • Pre Execution Hook Functions: Run just before the main execution step of a transaction or operation in a smart contract account. They can perform preliminary actions or computations and optionally pass data to post execution hooks.

    • Examples:

      • In Crowdfunding: Check if the campaign has reached its funding goal before allowing a new contribution. If the goal is already met, the function might either reject the transaction or redirect the funds to a different function. 

      • Automated Token Trading: A function that analyzes market conditions using external oracles before execution.

  • Post Execution Hook Functions: Executed after the main execution step of a transaction or operation. They can handle cleanup, state resetting, or further processing based on the outcome of the execution and data received from pre execution hooks.

    • Examples:

      • In Crowdfunding: Responsible for sending a thank you message or a digital reward to the contributor after their contribution is successfully processed.

      • NFT Rental service: Handle the transfer of NFT access rights back to the owner after the rental period ends, ensuring the NFT is returned automatically.
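To make the ordering of these functions concrete, here's a minimal Python sketch of a modular account. It is a toy model under my own assumptions: the class, method, and hook names are hypothetical and are not the ERC-6900 interfaces. It only shows hooks wrapping a validation function and an execution function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyModularAccount:
    """Illustrative only; these names are not the ERC-6900 interfaces."""
    owner: str
    pre_validation_hooks: List[Callable[[str, dict], None]] = field(default_factory=list)
    pre_execution_hooks: List[Callable[[dict], None]] = field(default_factory=list)
    post_execution_hooks: List[Callable[[dict], None]] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def handle(self, caller: str, call: dict) -> None:
        for hook in self.pre_validation_hooks:   # hooks may raise to block the call
            hook(caller, call)
        if caller != self.owner:                 # validation function
            raise PermissionError("caller is not authorized")
        for hook in self.pre_execution_hooks:
            hook(call)
        # Execution function: here, a simple transfer recorded in a log.
        self.history.append(f"sent {call['value']} to {call['to']}")
        for hook in self.post_execution_hooks:
            hook(call)


def max_value_hook(limit: int) -> Callable[[str, dict], None]:
    """Pre validation hook that rejects calls above a per-transaction limit."""
    def hook(caller: str, call: dict) -> None:
        if call["value"] > limit:
            raise ValueError("spending limit exceeded")
    return hook


account = ToyModularAccount(owner="alice", pre_validation_hooks=[max_value_hook(10)])
account.post_execution_hooks.append(lambda call: account.history.append("post hook ran"))

account.handle("alice", {"to": "bob", "value": 3})
print(account.history)  # ['sent 3 to bob', 'post hook ran']
```

A call with a value above 10 would be stopped by the pre validation hook before the validation function or the execution function ever runs.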

These examples just scratch the surface of functionality developers can create using modules. For more inspiration, the Rhinestone team has compiled a list of ideas here. To understand the specific formatting requirements for modules, you can refer to the ERC documentation here. Keep in mind that this standard is still a work in progress. Modular account abstraction teams are actively refining the specifications, so it's important to stay updated with the latest information online.

Biconomy and Rhinestone are launching the Module Store, an app store for modules, backed by the Rhinestone Protocol. This protocol facilitates the lifecycle of smart account modules, from development to installation and monetization. The Module Store aims to ensure interoperability, security, and ease of use for modules. Starting in Q1 of 2024, all dapps and wallets built on Biconomy’s embedded smart account framework will have access to the Module Store. This marketplace will feature smart account modules, developed by various creators and secured by a network of auditors, offering a seamless plug-and-play solution for dapp developers. For further details on the store, you can read the announcement here.

Managing Smart Accounts Across Different Chains

In a future where Layer 2s are widely used and everyone has smart accounts, we encounter a significant challenge. If you wish to change the keys controlling your smart account on one L2, you currently have to repeat this process individually on each L2 you use. For example, if your smart account is managed by four EOAs and you decide to remove one, this change has to be implemented separately on each L2.

Vitalik’s Keystore Proposal

To solve this, Vitalik has suggested a Keystore Contract for each user, which could be located on either L1 or L2. This Keystore Contract would keep a record of your valid signing keys and the rules for changing them. Your smart account on every chain would then reference the Keystore Contract to look up the current valid signing keys, so a key change only has to happen in one place. Vitalik discusses various methods for this, including secure cross-chain message transmission, choosing the best proof methods, ensuring privacy, and more. For an in-depth understanding, you can read his full post.
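The core idea can be sketched in a few lines of Python. This is a toy model with hypothetical class and key names. It deliberately ignores the hard part, which is how an account on one chain securely reads the keystore on another.

```python
class ToyKeystore:
    """Stand-in for a per-user keystore contract (names are hypothetical)."""

    def __init__(self, keys):
        self.valid_keys = set(keys)

    def remove_key(self, key: str) -> None:
        self.valid_keys.discard(key)


class ToySmartAccount:
    """A smart account on some chain that defers key checks to the keystore."""

    def __init__(self, chain: str, keystore: ToyKeystore):
        self.chain = chain
        self.keystore = keystore

    def is_authorized(self, key: str) -> bool:
        # A real account would verify a cross-chain proof of the keystore state here.
        return key in self.keystore.valid_keys


keystore = ToyKeystore({"key_a", "key_b", "key_c", "key_d"})
accounts = [ToySmartAccount(chain, keystore) for chain in ("l2_one", "l2_two", "l2_three")]

keystore.remove_key("key_d")  # one update, in one place
print([acct.is_authorized("key_d") for acct in accounts])  # [False, False, False]
```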

Module Solution

The Multichain Validation Module developed by Biconomy offers a solution for managing smart accounts across various blockchain networks and rollups. It allows for the authorization of multiple operations on different chains with a single user signature. This is achieved by using Merkle Trees to combine multiple UserOp hashes into one Merkle Root, which the user signs. This method enables actions such as deploying and configuring smart accounts on several chains or issuing session keys with different permissions across chains, using just one signature. The module employs a trust model similar to the general ERC-4337 flow, where users must trust the app generating the Merkle Root to not include any malicious UserOp hashes. For more.
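Here's a minimal sketch of the Merkle idea in Python, not Biconomy's implementation. Several UserOp hashes are combined into one root, and each chain can check that its own UserOp hash belongs to the root the user signed. As assumptions of mine, SHA-256 stands in for keccak256 (which the Python standard library lacks), pairs are sorted before hashing, and the UserOp hashes are made-up placeholders.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    # SHA-256 is a stand-in for keccak256 in this sketch.
    return hashlib.sha256(data).digest()


def hash_pair(a: bytes, b: bytes) -> bytes:
    # Sorting the pair means a proof does not need left/right position flags.
    return h(min(a, b) + max(a, b))


def merkle_root_and_proofs(leaves: List[bytes]) -> Tuple[bytes, List[List[bytes]]]:
    """Returns the root and one proof per leaf. Assumes at least one leaf."""
    proofs: List[List[bytes]] = [[] for _ in leaves]
    # Each entry is (node hash, indices of the leaves underneath it).
    level = [(leaf, [i]) for i, leaf in enumerate(leaves)]
    while len(level) > 1:
        next_level = []
        for j in range(0, len(level), 2):
            if j + 1 == len(level):  # an odd node is carried up unchanged
                next_level.append(level[j])
                continue
            (left, left_ids), (right, right_ids) = level[j], level[j + 1]
            for i in left_ids:
                proofs[i].append(right)
            for i in right_ids:
                proofs[i].append(left)
            next_level.append((hash_pair(left, right), left_ids + right_ids))
        level = next_level
    return level[0][0], proofs


def verify(leaf: bytes, proof: List[bytes], root: bytes) -> bool:
    node = leaf
    for sibling in proof:
        node = hash_pair(node, sibling)
    return node == root


# Hypothetical UserOp hashes, one per chain.
user_op_hashes = [h(f"userOp on chain {c}".encode()) for c in (1, 10, 137)]
root, proofs = merkle_root_and_proofs(user_op_hashes)

# The user signs `root` once; each chain checks its own UserOp hash against it.
print(all(verify(leaf, proof, root) for leaf, proof in zip(user_op_hashes, proofs)))  # True
```

This also makes the trust assumption visible: whoever builds the list of leaves decides what the single signature authorizes.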

ERC-1271: Verifying Signatures from Smart Contracts (Accounts)

There are times when a user signs a message offchain, and then someone else, like another user or a relayer, submits the signed message onchain on the user's behalf. For example, imagine a user wants to buy a specific Pudgy Penguin NFT for 3 ETH on OpenSea. They sign a message about this offer offchain and send it to OpenSea's servers. Later, when the owner of the Pudgy Penguin decides to accept this offer, the buyer's signed message and signature are passed to the OpenSea contract to complete the purchase. Another scenario could involve a DeFi app offering gasless transactions for its users. In this case, users sign messages offchain for actions like trades or loan requests, then send these signed messages to the DeFi protocol's servers, which subsequently submit them to the blockchain, covering the gas fees. In both these situations, and others like them, the traditional method of handling signed messages onchain, before the introduction of ERC-1271, involved using the ecrecover precompile from Solidity. ecrecover takes the signature and message hash and returns the signer's address. However, this method has its limitations, as it works only with messages signed by EOAs and not with those originating from smart contracts (which is what smart accounts technically are).

ERC-1271 is a crucial standard for fully realizing the benefits of account abstraction. Although smart accounts are now first class citizens in the EVM with ERC-4337, this alone isn't sufficient if smart contract developers continue to use ecrecover for signature validation. Unlike EOAs, smart contracts don't have private keys and therefore can't create standard Ethereum signatures. ERC-1271 enables smart contracts to 'sign' data by letting them define their own criteria for what constitutes a valid signature. A smart contract following ERC-1271 includes an isValidSignature function. When another contract needs to confirm a signature claiming to be from a specific smart contract, it calls that contract's isValidSignature function, passing in the data's hash and the signature. If the isValidSignature function deems the signature valid, it returns a special value, 0x1626ba7e. Otherwise, it should return a different value, such as 0xffffffff. This function contains custom logic to determine the validity of a given signature, which can vary based on the contract. For instance, a multi-signature contract might verify if the signature matches any of its approved signers. Or it could involve calling other functions within the contract, examining the contract's own state, or even interacting with external contracts to assess a signature's validity.
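The shape of that check can be modeled in a short Python sketch. It is a toy under loud assumptions: HMAC-SHA256 with a per-signer secret stands in for real signatures so the example runs on the standard library alone, and the class and helper names are hypothetical. Only the two return values come from the text above.

```python
import hashlib
import hmac

MAGIC_VALUE = bytes.fromhex("1626ba7e")  # returned when the signature is valid
INVALID = bytes.fromhex("ffffffff")      # any other value signals failure


def sign(secret: bytes, data_hash: bytes) -> bytes:
    # Stand-in for a real signature scheme (assumption for this sketch only).
    return hmac.new(secret, data_hash, hashlib.sha256).digest()


class ToyMultiSignerAccount:
    """Models a contract that accepts a signature from any approved signer.

    An onchain contract would store public keys or addresses, never secrets.
    """

    def __init__(self, signer_secrets):
        self._signer_secrets = list(signer_secrets)

    def is_valid_signature(self, data_hash: bytes, signature: bytes) -> bytes:
        # Custom validation logic: accept if any approved signer produced it.
        for secret in self._signer_secrets:
            if hmac.compare_digest(sign(secret, data_hash), signature):
                return MAGIC_VALUE
        return INVALID


account = ToyMultiSignerAccount([b"signer-one", b"signer-two"])
offer_hash = hashlib.sha256(b"buy NFT for 3 ETH").digest()

print(account.is_valid_signature(offer_hash, sign(b"signer-two", offer_hash)).hex())  # 1626ba7e
print(account.is_valid_signature(offer_hash, sign(b"stranger", offer_hash)).hex())    # ffffffff
```

The caller never needs to know how the account decides. It only compares the returned value with 0x1626ba7e.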

Let's revisit the OpenSea Pudgy Penguin example to clarify how this process works. Suppose a user has a smart account where Face-ID passkeys act as the signer. When this user decides to make an offer to buy a Pudgy Penguin NFT for 3 ETH, they create a commitment or instruction stating, "I agree to buy this NFT for 3 ETH." This commitment is signed using their Face ID and both the commitment and signature are submitted to OpenSea's servers. Later, when the current owner of the NFT decides to accept this offer, OpenSea begins processing the transaction. This involves interacting with the buyer's smart account. The smart account has an isValidSignature function, which is compliant with the ERC-1271 standard. This contract's isValidSignature function would be similar in functionality to ecrecover, but in this case, it would most likely call out to an external contract (or, perhaps by the time you're reading this, EIP-7212 has been implemented and there's a precompile to verify signatures made by passkeys), inputting the message and signature and outputting a public key. This external function would use ECDSA on the secp256r1 curve. If the output matches the public key stored in the smart account, it will return 0x1626ba7e, signaling that the signature is indeed valid. Once verified, OpenSea executes the transaction on the blockchain. This action transfers the NFT from the seller to the buyer and moves 3 ETH from the buyer's wallet to the seller.

Enshrining Account Abstraction

ERC-4337 offers a significant advancement in Ethereum's functionality, but it's not without its drawbacks. One of the main challenges is that existing users cannot seamlessly transition from their current EOAs to this new system. They would need to transfer all assets and activities from their EOAs to smart accounts. Additionally, using ERC-4337 costs more gas than the traditional method: approximately 42k for basic user actions, which is double the 21k needed for standard transactions. It also reduces the effectiveness of anti-censorship methods like crLists, depends on a smaller set of nodes, and relies on unconventional methods such as eth_sendRawTransactionConditional. Another compatibility issue arises for contracts that depend on tx.origin, because under ERC-4337 tx.origin is the address of the bundler rather than the user's account. Given these considerations, it’s essential to explore different proposals and implementations that could integrate an account abstraction solution directly into the Ethereum protocol. In the following sections, we'll dive into these alternatives and their potential to enhance Ethereum's user experience.

RIP-7560: Integrating Account Abstraction into EVM-like Rollups

RIP stands for Rollup Improvement Proposal (yes, I know, we're not great at naming in Ethereumland, get used to it). It's different from EIPs in that RIPs propose upgrades specifically for EVM-like rollups. Rollups can more easily upgrade their protocol compared to Ethereum's base layer, which is stricter and undergoes extensive testing to minimize the risk of bugs. In a sense, rollups can use RIPs to 'test' new upgrades for Mainnet. Successful implementations on rollups might later be adopted by the base layer. This particular RIP, proposed by Vitalik and others, aims to integrate account abstraction directly into the protocol, rather than as an upstream layer like in ERC-4337. It addresses some issues with ERC-4337 while facilitating an easy transition from ERC-4337 to RIP-7560.

This proposal introduces a new transaction type named AA_TX_TYPE, which looks similar to a User Operation. Its fields include chainId, sender, nonce, builderFee, callData, paymasterData, deployerData, maxPriorityFeePerGas, maxFeePerGas, validationGasLimit, paymasterGasLimit, callGasLimit, accessList, and signature. It also proposes a new approach to managing nonces for smart accounts with a two-dimensional system utilizing a key and a sequence value. The NonceManager, a dedicated contract, administers this system, tracking these nonces to ensure their correct application.

Here’s the process for nonce verification:

  1. Initiate a transaction from your smart account and select a key that categorizes the transaction.

  2. Check the current sequence value for that key using the NonceManager

  3. Submit your transaction with the chosen key and the next sequential value.

  4. The NonceManager verifies if the sequence value is correct (the next in line for that key) and then increments it for subsequent transactions.
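The steps above can be sketched as a small Python model. This is only an illustration of the key-plus-sequence idea. The class and method names are hypothetical and do not mirror the NonceManager contract's actual interface.

```python
from collections import defaultdict


class ToyNonceManager:
    """Tracks one sequence counter per (sender, key) pair."""

    def __init__(self):
        self._sequences = defaultdict(int)

    def next_sequence(self, sender: str, key: int) -> int:
        # Step 2: look up the sequence value expected next for this key.
        return self._sequences[(sender, key)]

    def validate_and_increment(self, sender: str, key: int, sequence: int) -> bool:
        # Step 4: accept only the next value in line, then advance the counter.
        if sequence != self._sequences[(sender, key)]:
            return False
        self._sequences[(sender, key)] += 1
        return True


manager = ToyNonceManager()
print(manager.validate_and_increment("0xAccount", key=1, sequence=0))  # True
print(manager.validate_and_increment("0xAccount", key=2, sequence=0))  # True, keys are independent
print(manager.validate_and_increment("0xAccount", key=1, sequence=0))  # False, replay is rejected
print(manager.next_sequence("0xAccount", key=1))                       # 1
```

Because each key has its own counter, transactions under different keys don't have to wait on one another, while ordering is still enforced within a key.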

Here’s the flow to verify and execute a transaction:

  1. Validation Phase:

    1. Nonce Validation and Increment: Updates the transaction count for an account.

    2. Sender Deployment: If it's the sender's first time, in this step a new smart account is deployed.

    3. Sender Validation: Ensures the transaction details are accurate and validates signature.

    4. Paymaster Validation (Optional): A paymaster, if involved, validates its role in the transaction.

  2. Execution Phase: 

    1. Sender Execution: Carries out the planned actions of the transaction.

    2. Paymaster Post-Transaction (Optional): The paymaster, if part of the transaction, carries out any needed actions after the transaction is done.

This RIP, while currently not fully detailed in every aspect, is essentially a proposal to incorporate the advantages of account abstraction into a rollup. It aims for a smooth transition from the existing ERC-4337 standard. I only gave you a high-level overview here, not touching on every aspect of the RIP. To get a deeper understanding of the proposal, you can read the complete RIP at this GitHub link. For the most recent updates on this proposal, keep an eye on the discussion thread at the Ethereum Magicians forum, available here.

zkSync Native Implementation

zkSync is an EVM compatible Layer 2 rollup. Unlike EVM equivalent rollups that closely mirror the Mainnet EVM, zkSync made specific design choices that slightly modify their virtual machine. These adjustments not only make its bytecode more SNARK-friendly but also allow for the inclusion of features absent in Ethereum's base layer. One notable feature is the native implementation of account abstraction. zkSync has already integrated smart accounts and paymasters, similar to what we're familiar with, but built directly into its protocol.

Similar to RIP-7560, zkSync also had to come up with a method for verifying the nonces of smart accounts. Their solution allows users to select any 256-bit number as a nonce, with the stipulation that each nonce can be used only once. To enforce this, they utilize an external contract named the NonceHolder. For developers working with zkSync, it is recommended to adhere to the account and paymaster interfaces as specified by zkSync.

Accounts on zkSync are required to implement specific methods. Here's a breakdown of them and their functionality:

  • validateTransaction: Used during the transaction's validation phase. It sets the rules for programmable verification of transactions. If a transaction fails these verification rules, this method must revert the transaction.

  • payForTransaction or prepareForPaymaster:

    • payForTransaction: Used when a transaction does not involve a paymaster. Accounts should implement it to manage the transaction fees themselves.

    • prepareForPaymaster: Called when a transaction involves a paymaster. If it executes without reverting, the system proceeds to call the paymaster’s validateAndPayForPaymasterTransaction method, passing along the necessary data.

  • executeTransaction: Executes the actual transaction as per the defined instructions.

  • executeTransactionFromOutside (Optional, but Recommended): Allows external entities to initiate transactions from the account. This functionality is similar to the standard Ethereum model, where an EOA can initiate transactions from a smart contract.

For paymasters, the following methods are necessary:

  • validateAndPayForPaymasterTransaction: Used to determine whether the paymaster is willing to cover the costs of a specific transaction. If the paymaster agrees to pay, it must transfer an amount no less than tx.gasprice * tx.gasLimit to the operator. Additionally, the method should return a 'context', which is used by the postTransaction method, provided the latter is implemented.

  • postTransaction (Optional): Called after the transaction has been executed. It's used for any additional logic or actions the paymaster may need to perform post-transaction. This could include steps like cleanup or bookkeeping.

Here's an explanation of each step in the transaction flow:

  1. Validation Step:

    1. Nonce Check: The system first checks the transaction's nonce using the NonceHolder. It confirms the nonce hasn't been used before.

    2. validateTransaction Method: The system calls the validateTransaction method of the account involved to validate the signature. If this method does not revert, the process moves forward.

    3. Nonce Usage Verification: After the validation, the NonceHolder marks the nonce as used, updating its status in the system.

    4. Fee Payment (Without Paymaster): For transactions that don't involve a paymaster, the payForTransaction method of the account is called to handle the transaction fees.

    5. Fee Payment (With Paymaster): If a paymaster is involved in the transaction, the system first calls the prepareForPaymaster method of the sender's account. If successful, the system then proceeds to call the validateAndPayForPaymasterTransaction method of the paymaster.

    6. Bootloader Fee Verification: The system checks whether the bootloader (responsible for transaction execution) has received the required transaction fees, calculated as tx.gasPrice * tx.gasLimit. If the fees are appropriately received, the transaction successfully passes the validation phase.

  2. Execution Step:

    1. executeTransaction Method: The system executes the executeTransaction method of the account, which contains the actual logic of the transaction.

    2. Post-Transaction (With Paymaster): For transactions involving a paymaster, the postTransaction method of the paymaster is called. This step is typically used for refunding any unused gas to the sender, especially relevant when transaction fees are paid in ERC-20 tokens rather than ETH.
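To tie the steps together, here's a simplified Python simulation of that flow. It is a sketch under my own assumptions, not zkSync's bootloader: the classes are toys, the method names are snake_case versions of the ones listed above, and the signature check is replaced by a trivial sender comparison.

```python
from dataclasses import dataclass
from typing import Optional


class ToyPaymaster:
    def __init__(self, balance: int):
        self.balance = balance
        self.post_calls = 0

    def validate_and_pay_for_paymaster_transaction(self, tx) -> int:
        fee = tx.gas_price * tx.gas_limit
        if self.balance < fee:
            raise ValueError("paymaster cannot cover the fee")
        self.balance -= fee
        return fee                     # amount handed to the bootloader

    def post_transaction(self, tx) -> None:
        self.post_calls += 1           # e.g. refunds or bookkeeping


@dataclass
class Tx:
    sender: str
    nonce: int
    gas_price: int
    gas_limit: int
    paymaster: Optional[ToyPaymaster] = None


class ToyNonceHolder:
    def __init__(self):
        self._used = set()

    def is_used(self, sender: str, nonce: int) -> bool:
        return (sender, nonce) in self._used

    def mark_used(self, sender: str, nonce: int) -> None:
        self._used.add((sender, nonce))


class ToyAccount:
    def __init__(self, address: str, balance: int):
        self.address = address
        self.balance = balance
        self.executed = []

    def validate_transaction(self, tx: Tx) -> None:
        if tx.sender != self.address:  # stand-in for a real signature check
            raise ValueError("validation failed")

    def pay_for_transaction(self, tx: Tx) -> int:
        fee = tx.gas_price * tx.gas_limit
        if self.balance < fee:
            raise ValueError("account cannot cover the fee")
        self.balance -= fee
        return fee

    def prepare_for_paymaster(self, tx: Tx) -> None:
        pass                           # e.g. approve tokens for the paymaster

    def execute_transaction(self, tx: Tx) -> None:
        self.executed.append(tx.nonce)


def process(tx: Tx, account: ToyAccount, nonces: ToyNonceHolder) -> None:
    # Validation step
    if nonces.is_used(tx.sender, tx.nonce):
        raise ValueError("nonce already used")
    account.validate_transaction(tx)
    nonces.mark_used(tx.sender, tx.nonce)
    if tx.paymaster is None:
        paid = account.pay_for_transaction(tx)
    else:
        account.prepare_for_paymaster(tx)
        paid = tx.paymaster.validate_and_pay_for_paymaster_transaction(tx)
    if paid < tx.gas_price * tx.gas_limit:
        raise ValueError("bootloader did not receive the required fee")
    # Execution step
    account.execute_transaction(tx)
    if tx.paymaster is not None:
        tx.paymaster.post_transaction(tx)


holder = ToyNonceHolder()
alice = ToyAccount("alice", balance=1_000)
sponsor = ToyPaymaster(balance=1_000)
process(Tx("alice", nonce=42, gas_price=2, gas_limit=100), alice, holder)
process(Tx("alice", nonce=7, gas_price=2, gas_limit=100, paymaster=sponsor), alice, holder)
print(alice.balance, sponsor.balance, alice.executed)  # 800 800 [42, 7]
```

Note that the nonces 42 and 7 are used out of order. In this model, as in the description above, the only rule is that a nonce can never be used twice.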

EIP-3074

EIP-3074 is still in the review phase and aims to introduce two new opcodes to the EVM that will allow smart contracts to send transactions in the context of an EOA. Essentially, this means that when these new opcodes are used, a smart contract can send a message or call another smart contract, and it will appear as if the message is coming from an EOA, not from the calling smart contract. This is a significant change because it means you don't need a smart contract account to perform these actions. You can simply use what's known as an invoker contract, which is a straightforward, secure type of smart contract that can execute transactions on behalf of your account. EIP-3074 doesn't create a new type of transaction; it simply adds two new opcodes to the EVM.

To illustrate how it works from a high level, here's a walkthrough of a simple example. In an ERC-20 contract, the transfer function might look like this:

function transfer(address _to, uint256 _value) public returns (bool success) {
  // The caller (msg.sender) must hold enough tokens
  require(balances[msg.sender] >= _value);
  // Debit the caller and credit the recipient
  balances[msg.sender] -= _value;
  balances[_to] += _value;
  emit Transfer(msg.sender, _to, _value);
  return true;
}

With EIP-3074, you can sign a message from your EOA indicating a desire to transfer tokens. This signed message is sent onchain to your invoker contract. The invoker contract then executes the actions, which include calling the transfer function of the ERC-20 contract. Normally, msg.sender in the ERC-20 contract's transfer function would refer to the invoker contract. However, EIP-3074 changes this: msg.sender can be set to your EOA, even though the call to the ERC-20 contract is made by the invoker contract. This allows you to transfer tokens from your EOA by calling the ERC-20 contract through your invoker contract, changing the traditional understanding of msg.sender. Therefore, in this system, users don't need to transfer assets out of their EOAs to take advantage of it.

The two new opcodes in EIP-3074 are AUTH and AUTHCALL. AUTH is used to set up a special kind of permission using a digital signature, specifically an ECDSA signature. This process establishes an authorized context in the EVM for a particular user account. To use AUTH, a signature and a hashed message, known as a commit, are required. The EVM uses this information to identify the signer of the message, and this signer's address is marked as authorized.

Following this, AUTHCALL comes into play. It allows transactions or messages to be sent as if they are coming from the authorized user account, not the smart contract that is actually initiating the action. It's similar to the call opcode, but it sets the authorized context address as the msg.sender. This is a major shift because it means that the actions look like they are coming from a regular user's EOA rather than from a smart contract. When another smart contract receives a message or transaction and checks the sender (using msg.sender), it sees the address of the authorized EOA, not the smart contract that made the call. This mechanism allows smart contracts, known as invokers, to perform actions that were previously exclusive to user accounts, allowing a lot of the benefits of account abstraction to be unlocked.
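Here's a toy Python model of the idea behind the two opcodes. It is only an illustration under stated assumptions: HMAC with a table of secrets stands in for ECDSA signing and recovery so the example runs on the standard library, every class and function name is hypothetical, and a real invoker would also check that the signed commit matches the calls it makes.

```python
import hashlib
import hmac
from typing import Dict, Optional

# Stand-in for ECDSA: a real chain recovers the signer from the signature alone.
# Here a table of secrets lets recover_signer imitate that (assumption).
SECRETS: Dict[str, bytes] = {"alice_eoa": b"alice-private-key"}


def sign(secret: bytes, commit: bytes) -> bytes:
    return hmac.new(secret, commit, hashlib.sha256).digest()


def recover_signer(commit: bytes, signature: bytes) -> Optional[str]:
    for address, secret in SECRETS.items():
        if hmac.compare_digest(sign(secret, commit), signature):
            return address
    return None


class ToyToken:
    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, msg_sender: str, to: str, value: int) -> bool:
        # The token only ever looks at msg_sender, as in the Solidity example.
        if self.balances.get(msg_sender, 0) < value:
            raise ValueError("insufficient balance")
        self.balances[msg_sender] -= value
        self.balances[to] = self.balances.get(to, 0) + value
        return True


class ToyInvoker:
    def __init__(self):
        self.authorized: Optional[str] = None

    def auth(self, commit: bytes, signature: bytes) -> None:
        # Models AUTH: recover the signer and store it as the authorized context.
        self.authorized = recover_signer(commit, signature)

    def authcall_transfer(self, token: ToyToken, to: str, value: int) -> bool:
        # Models AUTHCALL: like a call, but msg.sender is the authorized EOA.
        if self.authorized is None:
            raise PermissionError("no authorized account")
        return token.transfer(self.authorized, to, value)


token = ToyToken({"alice_eoa": 100})
invoker = ToyInvoker()
commit = hashlib.sha256(b"transfer 40 tokens to bob").digest()

invoker.auth(commit, sign(SECRETS["alice_eoa"], commit))
invoker.authcall_transfer(token, "bob", 40)
print(token.balances)  # {'alice_eoa': 60, 'bob': 40}
```

The tokens leave the EOA's balance even though the invoker made the call, and the invoker itself never held any tokens.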

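A basic flow to send one or more transactions looks like this: the EOA owner signs a commit offchain, anyone can relay the signed message to the invoker contract, the invoker uses AUTH to verify the signature and set the authorized account, and it then uses AUTHCALL to make one or more calls with msg.sender set to the EOA.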

Under the EIP-3074 system, who sends a transaction to a contract is not important as long as the signature from the EOA is valid. This flexibility allows a transaction to be sent by someone else or a different account, not just the account owner. However, it's important to note that currently, you can't use EIP-3074 to send ETH directly from an EOA. This limitation is due to the need to maintain certain key assumptions in the Ethereum network, like how a transaction's validity is determined. Instead, any ETH involved in these transactions comes from the balance of the invoker, the smart contract initiating the action. Although you can't directly send ETH from an EOA using EIP-3074, you can still transfer ETH to the invoker contract. The invoker can then use this ETH as needed for the transactions it handles. For more information, you can read the full EIP post here and follow the subsequent Ethereum Magicians discussion here.

Conclusion

Hopefully, you now have a solid understanding of Ethereum's accounts and transactions. We've looked at how they worked before ERC-4337, how they function with the adoption of ERC-4337, and what the future holds for account abstraction. It's been a challenging journey toward account abstraction. Sometimes, I wonder how the Ethereum user experience might have looked if Vitalik and the other founders had managed to launch Ethereum with built-in account abstraction, as they originally intended. Looking at zkSync's clean and straightforward in-protocol implementation of account abstraction, it's clear what could have been. However, the developers are already stretched thin, and I'm grateful for ERC-4337, despite its complexities, for bringing us the benefits of account abstraction today.

The development of Ethereum is thrilling to me, and I can't imagine dedicating my time to anything else. Just last year, Ethereum accomplished a monumental technological feat by transitioning from proof of work to proof of stake. Next year, the protocol is set to scale like never before, introducing a new dedicated temporary dataspace for rollups. This change will significantly reduce the cost of transactions on layer two. Meanwhile, the user experience is undergoing a major transformation. ERC-4337 is live, and now it's about building applications that leverage it. I'm excited for the next wave of users who won't have to worry about writing down 12 words on a piece of paper or needing to click multiple times just to swap tokens. A new era has arrived, and it's time to start building.

Subscribe to Jason Chaskin and never miss a post.