WeaveVM’s solution to the EVM world state growth bottleneck (vol. 1)

February 14, 2024

The central data structure governing accounts, balances, code, and contract storage in an EVM-based blockchain is the world state trie. The effective management of this data structure is a pivotal concern in scaling these blockchains. The storage location, access speed, and modification efficiency of the trie are critical parameters influencing the performance of the network client. In essence, the optimal handling of the world state trie serves as a cornerstone in the pursuit of a scalable and robust blockchain infrastructure.

Currently, the statebase has to be kept on Solid State Drives (SSDs), because hard disks are too slow for the scattered reads that network clients typically perform. SSDs provide the random-read speed and responsiveness that efficient data retrieval in a network client requires.

ℹ️ In this article, Statebase and World State Trie are used interchangeably.

Attempts to tackle the world state trie growth bottleneck

Over the past few years, network clients have improved noticeably, for example through increases in the block gas limit. Fine-tuning the statebase, however, has proven to be quite a challenge, and the statebase currently stands out as the main bottleneck for EVM performance. For instance, the Erigon/TurboGeth team switched statebases several times in pursuit of better performance. To address this ongoing challenge, a new statebase, LMPT, has been designed specifically for this purpose.

The challenge has also been approached from a protocol perspective. The introduction of access lists (EIP-2930) allows statebase data to be pre-fetched, reducing stress during subsequent block execution. Moreover, the pricing of EVM state I/O opcodes has been revised multiple times to account for slowdowns in the statebase, especially as the state has expanded over time.
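To make the access-list idea concrete, here is a minimal sketch of how the upfront cost of declaring an access list can be computed. The two gas constants come from EIP-2930; the function name and the example addresses and storage keys are hypothetical and only serve as an illustration.

```python
# Gas constants defined by EIP-2930.
ACCESS_LIST_ADDRESS_COST = 2400
ACCESS_LIST_STORAGE_KEY_COST = 1900


def access_list_gas(access_list):
    """Return the upfront gas charged for declaring an access list.

    `access_list` is a list of (address, [storage_key, ...]) pairs.
    """
    total = 0
    for _address, storage_keys in access_list:
        total += ACCESS_LIST_ADDRESS_COST
        total += ACCESS_LIST_STORAGE_KEY_COST * len(storage_keys)
    return total


# Hypothetical addresses and storage keys, for illustration only.
example_access_list = [
    ("0x" + "11" * 20, ["0x" + "00" * 32, "0x" + "00" * 31 + "01"]),
    ("0x" + "22" * 20, []),
]

print(access_list_gas(example_access_list))  # 2 addresses + 2 storage keys
```

Because the transaction names the accounts and slots it intends to touch, a client can load them from the statebase before execution starts rather than discovering them one read at a time.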

Last year, while working on Fantom Sonic, the Fantom team conducted an analysis of the first 40 million mainnet blocks to identify which block-processing components dominate the runtime. The results were as follows: the EVM consumed 13% of the time, spent on processing and executing smart contracts, while the StateDB consumed 84%, spent on the Trie Hash. It is clear that the EVM network spends excessive time accessing storage in the StateDB (statebase), which calls for improvement.

There is also EIP-4444, which seeks to lower the hardware requirements of Ethereum nodes by purging old data, such as transactions more than a year old. This cuts storage needs, but it also means nodes need new ways to stay up to date without holding the full transaction history. Solutions include third-party centralized services, off-chain data storage, data torrents, and snapshot services, which make nodes more accessible without losing important data. According to Ethereum’s roadmap shared earlier this year, the EIP-4444 implementation is planned for 2024:

Ethereum 2024 roadmap

WeaveVM’s solution: a hyper-scalable sovereign rollup

A new statebase structure approach: ACID DB

The Ethereum yellow paper leaves the choice of database structure for network clients open-ended. Implementers can explore different database approaches in terms of architecture and design. In Ethereum’s design, Vitalik adopted a structure where nodes are connected by hash digests, adding a layer of abstraction to storage. Although an EVM-compatible blockchain could use an alternative structure, only a few have tried this so far. Despite challenges from hash indirection, clients can use inherent structural patterns to access node data efficiently.
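The hash indirection described above can be illustrated with a toy content-addressed store in which every node is saved under the digest of its own bytes and parents refer to children by digest. This is only a sketch: Ethereum uses Keccak-256 over RLP-encoded nodes, whereas the code below substitutes SHA3-256 and JSON from the Python standard library, and the `NodeStore` class is a hypothetical name.

```python
import hashlib
import json


class NodeStore:
    """Toy content-addressed store: each node is saved under the hash of its bytes."""

    def __init__(self):
        self._db = {}

    def put(self, node):
        # Stand-in encoding and hash (JSON + SHA3-256), not RLP + Keccak-256.
        encoded = json.dumps(node, sort_keys=True).encode()
        digest = hashlib.sha3_256(encoded).hexdigest()
        self._db[digest] = encoded
        return digest

    def get(self, digest):
        return json.loads(self._db[digest])


store = NodeStore()
leaf_a = store.put({"value": "balance=10"})
leaf_b = store.put({"value": "balance=25"})
root = store.put({"children": [leaf_a, leaf_b]})

# Reading a leaf means following digests from the root: one lookup per hop.
first_leaf = store.get(store.get(root)["children"][0])

# Changing a leaf changes its digest, and therefore the digest of the root.
new_leaf_a = store.put({"value": "balance=11"})
new_root = store.put({"children": [new_leaf_a, leaf_b]})
```

Each hop is a separate key lookup whose key is effectively random, which is why this layout produces the scattered reads discussed earlier.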

WeaveVM has chosen to use an ACID-compliant database for storing world state trie data on Arweave. This decision ensures immutability and data durability. With ACID compliance, WeaveVM benefits from an enhanced, horizontally scalable design for its nodes.

Bundled EVM transactions

Because WeaveVM uses Arweave’s data bundling services, such as AR.IO, the WVM can bundle up to 499 transactions within a single data storage transaction. This bundled transaction is then sent both to Arweave for storage and to the WVM sequencer for statebase modification. This approach optimises transaction handling and storage efficiency for WeaveVM.
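As a rough illustration of the bundling step, the sketch below splits a queue of pending transactions into bundles of at most 499 entries. Only the 499 limit comes from the text above; the function name and the placeholder transaction identifiers are hypothetical, and the actual submission to Arweave and to the sequencer is left out.

```python
MAX_TXS_PER_BUNDLE = 499  # limit stated in the article


def bundle_transactions(txs, max_per_bundle=MAX_TXS_PER_BUNDLE):
    """Split a list of transactions into consecutive bundles of bounded size."""
    return [txs[i:i + max_per_bundle] for i in range(0, len(txs), max_per_bundle)]


# Placeholder transaction identifiers, for illustration only.
pending = [f"tx-{n}" for n in range(1000)]
bundles = bundle_transactions(pending)

print([len(b) for b in bundles])
```

Order is preserved within and across bundles, which matters because the statebase is later derived from the posted data in sequence.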

Arweave as a permanent ledger

Diagram2

In an EVM network, the network’s statebase (W) is outside the blockchain (B). The ledger (L) comprises both, while the blockchain (B) determines the world state (W).

In the WeaveVM context, all state changes (transactions) are stored on Arweave permanently, so Arweave data storage will act as the Blockchain (B). 

The world state trie will be a centralised, ACID-compliant database that keeps the network’s latest storage-related data up to date with state modifications. The statebase is determined by the data posted to Arweave.

To sum up, the WeaveVM ledger comprises both the Arweave data storage and the centralised world state trie (sequencer).

Verifiable statebase data

As mentioned in the previous section, the WVM’s statebase will be stored and modified by a centralised sequencer, so it must be tolerant of corruption.

WVM already solves part of this by using Arweave as the blockchain (B). However, it must also be possible to prove that the centralized sequencer is serving the correct, untampered final state of the network (calculated by lazily evaluating the data posted to Arweave). To that end, the centralized sequencer (that is, the statebase (W)) will publish fraud proofs as signatures, which make it possible to validate the statebase against the blockchain data.
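The core of such a check can be sketched as recomputing the state from the posted data and comparing it with what the sequencer claims. The article does not specify the proof or signature format, so the sketch below leaves signatures out and makes several assumptions: state transitions are plain key-value writes, the claim is a SHA-256 digest of the canonical JSON state, and all function names are hypothetical.

```python
import hashlib
import json


def state_digest(state):
    """Digest of a state map, using sorted-key JSON as a canonical encoding."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def recompute_state(posted_writes):
    """Lazily evaluate the posted data: replay every write in order."""
    state = {}
    for key, value in posted_writes:
        state[key] = value
    return state


def is_claim_valid(posted_writes, claimed_digest):
    """Check a sequencer's claimed digest against the recomputed state."""
    return state_digest(recompute_state(posted_writes)) == claimed_digest


# Hypothetical posted data and two claims, one honest and one tampered.
posted = [("alice", 70), ("bob", 30), ("alice", 65)]
honest_claim = state_digest({"alice": 65, "bob": 30})
tampered_claim = state_digest({"alice": 650, "bob": 30})
```

Anyone holding the posted data can run the same replay, so a mismatch between the recomputed digest and the published one exposes a corrupted statebase.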

Onchain state, offchain execution

WeaveVM is built following the VACP framework guidelines (Verifiable Atomic Computing Paradigm). VACP adds many features to WVM such as RESTfulness, gaslessness, and more importantly, separating state from execution, and conducting the latter off-chain while guaranteeing verifiability. VACP standards make WVM “retro-decentralised by proof” or “lazy-eval decentralized” via the usage of Arweave and fraud proofs.

Let’s expand on that by walking through the lifecycle of a WVM transaction:

WVM transaction lifecycle

For each transaction, a simulator thread invokes a new EVM instance with an in-memory context that simulates on-chain features such as hard forks, gas costs, the block environment, and precompiled contracts, and that is seeded with the N-1 statebase as its initial state. If the simulation of a transaction fails, the simulator terminates and reverts the thread without making state changes. Otherwise, the N+1 statebase (the statebase resulting from the transaction simulation) is assigned as the network’s latest statebase.
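The simulate-then-commit pattern described above can be sketched as follows. This is not WeaveVM’s runtime: a balance transfer stands in for a full EVM execution, and `simulate`, `process`, and `SimulationError` are hypothetical names. The point is only that the simulation works on a scratch copy, so a failure leaves the previous statebase untouched.

```python
import copy


class SimulationError(Exception):
    """Raised when a simulated transaction cannot be executed."""


def simulate(state, tx):
    """Hypothetical stand-in for an EVM run: a balance transfer on a scratch copy."""
    scratch = copy.deepcopy(state)  # in-memory context seeded with the prior state
    if scratch.get(tx["from"], 0) < tx["amount"]:
        raise SimulationError("insufficient balance")
    scratch[tx["from"]] -= tx["amount"]
    scratch[tx["to"]] = scratch.get(tx["to"], 0) + tx["amount"]
    return scratch


def process(state, tx):
    """Return (latest_state, succeeded); a failed simulation changes nothing."""
    try:
        return simulate(state, tx), True
    except SimulationError:
        return state, False


state0 = {"alice": 5}
state1, ok1 = process(state0, {"from": "alice", "to": "bob", "amount": 3})
state2, ok2 = process(state1, {"from": "alice", "to": "bob", "amount": 10})
```

In this run the first transfer succeeds and yields a new statebase, while the second fails and `state2` is the same object as `state1`.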

For example, apart from high TPS and near-instant finality, separating execution from state significantly lowers the cost of deploying contracts on WeaveVM.

WeaveVM is very cost-effective when it comes to contract deployment. For example, deploying an ERC20 contract costs around $0.00197 in total (including the Arweave storage cost), and deploying an ERC721 factory contract costs around $0.002.

These costs are very competitive with other L2s and sovereign rollups, which is made possible by the VACP basis of WeaveVM.

cost of contract deployment on EVM chains

EVM in the cloud, no hardware requirements

Because WeaveVM stores its Statebase data on Arweave, the Sequencer (acting as the sole network node) will have no hardware requirements for processing TXs and storing the Statebase. As stated before, instead of relying on SSDs, the components needed to rebuild the WeaveVM Statebase (state transition requests, TXs) will be archived permanently on Arweave with fraud proofs.

This feature makes WVM the first cloud-based and completely RESTful EVM-compatible rollup.

Statebase characteristics comparison, WVM vs FEVM

WeaveVM and FVM storage comparison table

WeaveVM’s progress so far

At the time of writing, WeaveVM is working in parallel on a new EVM runtime and a custom JSON-RPC to ensure compatibility with MetaMask and the whole Ethereum tooling ecosystem.

We aim to publish a substantial update before the end of Q1, and will be attending ETH Denver 2024 - so come meet the team. Follow along with development progress on X at @weavevm.