[RFC] Simplified Mempool Design #4257
RFC: Simplified Mempool Design
Authors: Milos Zivkovic (@zivkovicmilos), Dragan Milosevic (@Milosevic02)
Date: May 7, 2025
Status: Draft
Disclaimer: This design assumes transaction sequences can be extracted from a Tendermint transaction, which is not the case today
Abstract
This RFC is a continuation of issue #1830, which briefly outlined a few problems with the current Tendermint2 mempool design. It aims to expand on them here, with concrete examples of how they can be resolved.
The order in which the issues are listed in the abstract is as follows:
Out-of-order transaction inclusion
The current implementation of the mempool module does not consider the account sequence ordering when returning “ready” transactions for block building, causing transactions to fail when being executed as part of a block later down the line.
The mempool currently returns transactions only ordered by their appearance (FIFO), and not by account sequence first. To make matters worse, the gas values by which transactions are included in a block are user-defined values which can be skewed, as discussed later.
This leads to a scenario where the mempool can return transactions for block building that are not in the correct order of sending for an account. For example:
- TX 1 (nonce 0) sends a transfer to accountX, initializing it in the process (the account was uninitialized)
- TX 2 (nonce 1), by accountX, deploys a Realm
- The mempool returns [TX2, TX1] for the block, since they are ordered by appearance
- TX 2 fails because the account is uninitialized
- TX 1 succeeds
- The user has to submit TX 2 again, to get the desired state

Realm deploys cost more gas and appeared first, so TX 2 will be executed before TX 1, since the mempool returns this transaction first. TX 2 will fail because the account is uninitialized. Account balance transfers cost less gas and appeared later, so they will be executed later in the mempool implementation.

Regardless of the block-inclusion algorithm (discussed below), the transactions for a block should be picked by their "executability" first: the ones whose nonce matches the next state nonce of the account. Otherwise, nonce-mismatched txs will always fail (we call this out-of-order inclusion), and require a resubmission by the user / dApp / client.
Having this in mind, the only way to guarantee a path to valid transaction execution is to sort the transactions for a block by their nonce first, and then by some other parameter that is relevant to the TM2 application. Transactions picked this way are guaranteed to be potentially executable, since the nonce check won't fail.
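As a rough illustration of that ordering rule, here is a minimal sketch, assuming a transaction exposes its sender and sequence (which, per the disclaimer above, TM2 transactions do not today); all names are illustrative:

```go
package mempool

import "sort"

// Tx is an illustrative transaction shape. TM2 transactions do not expose the
// sender / sequence this way today (see the disclaimer at the top of this RFC).
type Tx struct {
	Sender string
	Nonce  uint64
	Fee    uint64 // fee per unit of gas the sender is paying
}

// orderForBlock keeps only transactions that are executable next (their nonce
// continues the account's expected sequence), ordered per account by nonce,
// and orders accounts between themselves by the fee of their first ready tx.
func orderForBlock(txs []Tx, nextNonce map[string]uint64) []Tx {
	perAccount := make(map[string][]Tx)
	for _, tx := range txs {
		perAccount[tx.Sender] = append(perAccount[tx.Sender], tx)
	}

	var ready [][]Tx
	for sender, list := range perAccount {
		sort.Slice(list, func(i, j int) bool { return list[i].Nonce < list[j].Nonce })

		expected := nextNonce[sender]
		var run []Tx
		for _, tx := range list {
			if tx.Nonce != expected {
				break // nonce gap: the remaining txs are not executable yet
			}
			run = append(run, tx)
			expected++
		}
		if len(run) > 0 {
			ready = append(ready, run)
		}
	}

	// Order accounts by an application-relevant parameter (fee here), without
	// ever breaking an account's internal nonce order.
	sort.Slice(ready, func(i, j int) bool { return ready[i][0].Fee > ready[j][0].Fee })

	out := make([]Tx, 0, len(txs))
	for _, run := range ready {
		out = append(out, run...)
	}
	return out
}
```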
Block building uses user-defined gas for transactions
The PR #2838 introduced an EIP-1559-style fee model in which validator nodes (who propose blocks) can curate transactions based on their specified gas fee, or rather a "tip" meant to signal the sender's willingness to pay for valuable block space.
In a "functioning" EIP-1559 market (Ethereum, for example), the validator's block building process sorts its candidate transactions by effective fee (= base fee + tip), so that users who attach a higher tip (pay more for their tx) see their transactions committed to the chain earlier. That priority is the entire point of tipping and of minimum transaction fee markets.

In practice, TM2 does not honor that contract. Although the proposer does deduct the declared fee and route it to the coinbase (as part of the `AnteHandler`), the selection algorithm that fills the block ignores the fee signal entirely. Instead, it grabs the first (by appearance) largest transactions in the mempool by user-defined gas until the block limit is hit. What this effectively means is that a small (by gas) high-tip payment can be displaced by a much larger (by gas) low-tip transaction, even though the former is paying a higher effective fee per unit of gas. Users therefore receive none of the latency guarantees they believed they were buying.

The current implementation of the block building process does not consider the actual gas used by a transaction when trying to figure out whether it can fit into a block. Instead, the block building process simply reads the (user-defined) gas limit and checks whether the block has enough space to fit the transaction, even though the transaction actually uses less gas than what was specified.
This design eliminates the incentives EIP-1559 introduces.
Users who attach higher tips receive no latency advantage, so rational senders will adapt to minimize their tip and just overstate gas instead. The network could just as well be running on a static minimum-fee rule if all that matters is fitting into the block’s gas limit.
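To make the contrast concrete, here is a small sketch of the two selection criteria (field names are made up; TM2 transactions do not carry an explicit tip field in this form):

```go
// candidate is an illustrative block-builder view of a mempool transaction.
type candidate struct {
	GasWanted uint64 // user-declared gas limit
	Tip       uint64 // fee per gas unit paid on top of the base fee
}

// preferByFee is what an EIP-1559-style market implies: the transaction paying
// the higher effective fee per unit of gas (base fee + tip) goes first. With a
// uniform base fee inside one block, this reduces to comparing tips.
func preferByFee(a, b candidate, baseFee uint64) bool {
	return baseFee+a.Tip > baseFee+b.Tip
}

// fitsByDeclaredGas is what TM2 effectively does today: the fee signal is
// ignored, and a transaction is included as long as its declared gas limit
// still fits into the remaining block space.
func fitsByDeclaredGas(c candidate, remainingBlockGas uint64) bool {
	return c.GasWanted <= remainingBlockGas
}
```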
In practice, transaction block inclusion has proven to be natively an application choice, and not a Tendermint Core one. Applications should be the ones to decide how their transactions are prioritized in blocks. This concept is already adopted and live in Cosmos (via CometBFT) in the form of mempool lanes.
No support for transaction “swapping”
Building off the previous point, the Tendermint2 mempool currently lacks the ability to “swap transactions” inside the mempool.
This is a QoL feature that allows users to "speed up" their transactions in case they specified an initially low fee for block inclusion. Modern EVM chains let a sender "swap" a pending transaction for a new one with the same nonce but a higher effective gas price, which lets wallets offer the user an opportunity to cancel a transaction or accelerate it.
In this scenario, validators (block builders) would simply keep the highest-priced version (tx) and discard the rest, so the user can un-stick a transaction or raise its priority without waiting for time-outs. It is worth noting this mechanism only makes sense in case the previously discussed fee-ordering is present.
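A sketch of such a replacement rule, assuming the sender and nonce can be read from a transaction (which, per the disclaimer, they cannot be in TM2 today); the 10% bump threshold is an arbitrary illustrative choice:

```go
// pendingTx is an illustrative mempool entry keyed by (sender, nonce).
type pendingTx struct {
	Sender    string
	Nonce     uint64
	FeePerGas uint64
	Raw       []byte
}

// replaceBumpPercent is the minimum relative fee increase required for a
// replacement to be accepted, so marginal re-submissions cannot spam the pool.
const replaceBumpPercent = 10

// tryReplace keeps only the better-paying version of two transactions that
// share the same sender and nonce. It returns the tx to keep, and whether the
// incoming tx displaced the existing one.
func tryReplace(existing, incoming pendingTx) (pendingTx, bool) {
	if existing.Sender != incoming.Sender || existing.Nonce != incoming.Nonce {
		return existing, false // not a replacement for this slot
	}
	threshold := existing.FeePerGas + existing.FeePerGas*replaceBumpPercent/100
	if incoming.FeePerGas >= threshold {
		return incoming, true // keep the higher-priced version, discard the rest
	}
	return existing, false
}
```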
Mempool interface heavily leaks implementation details
The current mempool interface is defined as follows:
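(An approximation assembled from the methods referenced in this section; signatures follow upstream Tendermint conventions and may not match TM2 exactly.)

```go
// Approximate shape of the current Mempool interface, reconstructed from the
// methods discussed below. Treat the signatures as illustrative.
type Mempool interface {
	// Validating and admitting transactions via the app connection.
	CheckTx(tx types.Tx, callback func(abci.Response)) error
	CheckTxWithInfo(tx types.Tx, callback func(abci.Response), txInfo TxInfo) error

	// Block building: FIFO reap bounded by bytes and user-declared gas.
	ReapMaxBytesMaxGas(maxBytes, maxGas int64) types.Txs

	// Realigning the pool after a block commit.
	Update(height int64, txs types.Txs, preCheck PreCheckFunc, postCheck PostCheckFunc) error

	// Implementation details leaked to every caller (discussed below).
	Lock()
	Unlock()
	FlushAppConn() error
	Flush()
	InitWAL()
	CloseWAL()

	// Subscription API.
	TxsAvailable() <-chan struct{}
	EnableTxsAvailable()

	Size() int
}
```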
The `Mempool` interface is referenced from modules such as consensus, state, RPC, but it heavily leaks implementation details, and forces the mempool implementation to function in a specific way that is perhaps not optimal.
Locking / Unlocking
The mempool should itself protect its invariants, and not expose its atomicity to the user of the mempool (who now needs to be aware of which operations on the mempool aren’t atomic).
Methods such as `CheckTxWithInfo` lock the entire mempool internally, without expecting the caller to use `Lock` / `Unlock`:
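Roughly, the method body takes the pool-wide mutex itself, while that same mutex is also handed to callers through `Lock` / `Unlock` (a paraphrase, not the verbatim TM2 code):

```go
// Paraphrased: CheckTxWithInfo guards the whole pool internally.
func (mem *CListMempool) CheckTxWithInfo(tx types.Tx, cb func(abci.Response), txInfo TxInfo) error {
	mem.proxyMtx.Lock() // the same mutex Lock() / Unlock() expose to callers
	defer mem.proxyMtx.Unlock()

	// Cache checks, size checks, and dispatch to the app connection follow,
	// all while the entire pool is locked for every other caller.
	return nil
}
```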
WAL management
WAL initialization and closing is exposed to the caller / user of the mempool. This initialization is in fact done once, by the node when it is starting / stopping (`OnStart` / `OnStop`), so it does not belong in the public API for all mempool users.
Forced FIFO priority
The block building process calls `ReapMaxBytesMaxGas` to get the list of transactions to include within a new block. `ReapMaxBytesMaxGas` returns the transactions that fit into the gas range on a first-come-first-served basis.
Notice that inclusion is based on the user-defined gasWanted, and not on empirical data (transaction execution and validation before inclusion in a block).
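A simplified paraphrase of that selection loop (renamed here; not the verbatim TM2 code), showing that only the declared gasWanted bounds inclusion:

```go
// reapFIFO walks the c-list front to back (first come, first served) and stops
// once the declared byte or gas budget is exceeded. There is no fee signal, no
// nonce ordering, and no measurement of the gas a tx would actually consume.
func (mem *CListMempool) reapFIFO(maxBytes, maxGas int64) []types.Tx {
	var (
		txs        []types.Tx
		totalBytes int64
		totalGas   int64
	)
	for e := mem.txs.Front(); e != nil; e = e.Next() {
		memTx := e.Value.(*mempoolTx)
		if maxBytes > -1 && totalBytes+int64(len(memTx.tx)) > maxBytes {
			break
		}
		if maxGas > -1 && totalGas+memTx.gasWanted > maxGas {
			break
		}
		totalBytes += int64(len(memTx.tx))
		totalGas += memTx.gasWanted // user-declared, never the measured value
		txs = append(txs, memTx.tx)
	}
	return txs
}
```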
Flushing app-connection
The mempool exposes an app-conn flush API that the mempool caller can invoke. Following the implementation flow of this method leads to a no-op.
The user of the mempool should not be aware of how the mempool communicates with the application, but this API exposes that design decision.
Unnecessary subscription API
The mempool exposes a top-level subscription API: `TxsAvailable` returns a shared channel, which means multiple callers (listeners) can cause event starvation. It also has coupling with `EnableTxsAvailable`, causing the caller to be aware of the underlying implementation.
Changing this segment of the interface is a larger change that also affects the consensus process (block building), and as such is left for discussion.
C-List performance bottlenecks
As mentioned before, the mempool implementation has funky locking mechanics that lock down the entire mempool, even for accounts whose transactions are not being pruned / updated. The larger the pool grows, the greater this lock contention becomes.
The mempool is implemented as a doubly linked list. The operations that are most common on the mempool:
Perhaps the most disturbing fact is that, following a block commit, TM2 locks and rechecks every single tx in the pool before unlocking and allowing other parts of the system to interact with the mempool (e.g. the block building process that is waiting for the recheck to finish). In reality, this shouldn't be the case: only a subset of the pool's accounts / transactions is typically impacted by a block commit (and needs to be updated), rather than the entire pool being processed all over again.
It is worth noting that given the c-list mempool is FIFO, implementing custom ordering on it is not feasible, so fee-priority transactions (as well as in-order txs) cannot be honored.
The Cosmos community outlined more problems with the C-list implementation here, which is worth reading into. These issues are still happening in 2025.
The mempool currently present in CometBFT and in TM2 is flood-based, meaning transactions are gossiped out to peers regardless of their internal state. We discuss a DDoS protection mechanism below, in the proposal section, that aims to eliminate node strain even in a gossip-based pool. Future mempool implementations should look into Bullshark, and how the DAG-based consensus layer in the mempool unlocks throughput.
Motivation
Given all that was mentioned, the intention of moving away from the c-list mempool implementation is the following:
Approaches
There have been more than a few attempts to improve the mempool implementation in Cosmos, with each attempt introducing a new spin. The scope of these changes is outside the realm of this RFC, but we leave the resource links as reference for additional study:
- `CheckTx` returns `priority`
- `PrepareProposal` / `ProcessProposal` (CometBFT v0.37 docs)

Proposal
The proposal will cover the high-level architecture of a new mempool design, and focus on how it solves the outlined problems mentioned in the abstract.
High-level API
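A rough sketch of the shape implied by the rest of this proposal; only `Pending` and `Update` are named explicitly below, the remaining names are assumptions:

```go
// Sketch of the proposed high-level API, inferred from the sections below.
type Mempool interface {
	// Add validates a transaction and places it into the pool, either as a
	// "pending" (executable) or a "queued" (nonce-gapped) transaction.
	Add(tx Tx) error

	// Pending returns the currently executable transactions, in a valid
	// execution order, without removing them from the pool.
	Pending() []Tx

	// Update realigns the pool after a block commit: committed and stale
	// (lower-nonce) transactions are pruned, and queued transactions whose
	// nonce gap has been filled are promoted to pending.
	Update(committed []Tx)
}
```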
Pending vs. Queued Txs
There needs to be a distinction between “pending” and “queued” transactions in the mempool.
Pending transactions are transactions which are immediately “executable”, meaning they are next-in-order for the account to be executed (they match the next account sequence).
In contrast, queued transactions are transactions which cannot be executed currently, since there is a transaction with a lower sequence (nonce) value still missing from the mempool. Queued transactions can eventually become pending transactions if the nonce gap is filled; otherwise, they are “shelved” for later use and inclusion, depending on available memory capacity.
Examples:
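For illustration, assume accountX's next expected sequence (nonce) on chain is 5:
- a transaction from accountX with nonce 5 is pending, and can be included in the next block;
- a transaction from accountX with nonce 7 is queued, since nonce 6 is missing and the tx cannot be executed yet;
- once a transaction with nonce 6 arrives, both the nonce-6 and nonce-7 transactions become pending.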
Block Building
The block building process is straightforward with the proposed new API.
Since the mempool supports fetching executable transactions during the block building process through `Pending`, the process follows a simple flow.
Notice that at no point does the block building process lock the entire pool to pick a transaction, meaning transactions can keep coming into the pool asynchronously.
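A rough sketch of that flow, reusing the Mempool shape sketched under "High-level API" (the GasWanted accessor and helper names are assumptions):

```go
// buildBlock sketches the proposer-side flow: read executable txs without
// locking or mutating the pool, fit them into the block's gas budget, and
// leave the pool untouched until the block is actually committed.
func buildBlock(pool Mempool, maxBlockGas int64) []Tx {
	var (
		block    []Tx
		totalGas int64
	)
	for _, tx := range pool.Pending() { // non-destructive read
		if totalGas+tx.GasWanted() > maxBlockGas {
			continue // this tx doesn't fit; try the remaining candidates
		}
		totalGas += tx.GasWanted()
		block = append(block, tx)
	}
	return block
}

// Once the block is committed to the chain, the caller realigns the pool:
//
//	pool.Update(committedBlockTxs)
```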
Additionally, note that `Pending` does not remove the transactions from the pool, but just fetches executable transactions. The mempool is realigned when a block is committed and the caller invokes `Update` with the executed transactions.
Upon updating the mempool, all "stale" transactions (transactions whose nonce is now less than the account state nonce) are pruned, as well as all committed transactions. At this point, transactions for a specific account can also move from being "queued" to being "pending".
Pending() vs. Pop()
There is a nuance in the proposed Mempool API that's worth discussing.
The proposed API doesn't include a `Pop() Tx` method, but instead relies on the `Pending() []Tx` & `Update([]Tx)` combination (where `Pending` is non-destructive).
The reason for this choice is to avoid a situation where a transaction is popped from the pool during the block building process and cannot be re-added back in (for example, because the pool's capacity has filled up in the meantime).
In most blockchain clients, a transaction is dropped from the mempool only when a block is committed to the canonical chain. The only exception seems to be the re-injecting of transactions into the mempool for orphaned blocks -- this situation doesn't exist for instant-finality chains like TM2.
DDoS protection
The mempool is entirely an in-memory structure. This means that the space it occupies as part of the node process is limited, and should be tightly controlled.
We propose 3 memory limits to apply to the mempool in order for memory to be controlled properly and not degrade the node performance:
60
.20
The mempool should hold a "seen" cache / index, to know exactly which transactions it can reject in case it receives duplicates (from the RPC layer, or the p2p layer).
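A minimal sketch of such a seen-cache (the hashing choice and the lack of eviction are simplifications):

```go
package mempool

import (
	"crypto/sha256"
	"sync"
)

// seenCache remembers hashes of transactions the node has already processed,
// so duplicates arriving over RPC or p2p can be rejected cheaply. A real
// implementation would also bound its size and evict old entries.
type seenCache struct {
	mu   sync.Mutex
	seen map[[sha256.Size]byte]struct{}
}

func newSeenCache() *seenCache {
	return &seenCache{seen: make(map[[sha256.Size]byte]struct{})}
}

// markSeen records the raw transaction and reports whether it was a duplicate.
func (c *seenCache) markSeen(rawTx []byte) (duplicate bool) {
	key := sha256.Sum256(rawTx)

	c.mu.Lock()
	defer c.mu.Unlock()

	if _, ok := c.seen[key]; ok {
		return true
	}
	c.seen[key] = struct{}{}
	return false
}
```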
Mempool atomicity
The proposed mempool moves away from the doubly-linked list model and instead adopts a simple lookup table.
The mempool’s state can essentially be:
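One possible shape for that lookup table, with a separate lock per account (all names are illustrative):

```go
package mempool

import "sync"

// Tx is a minimal illustrative transaction shape.
type Tx struct {
	Nonce uint64
	Raw   []byte
}

// accountTxs groups one account's transactions under its own lock, so work on
// one account never blocks work on another.
type accountTxs struct {
	mu      sync.Mutex
	pending []Tx          // executable transactions, in increasing nonce order
	queued  map[uint64]Tx // nonce-gapped transactions, keyed by nonce
}

// pool is a simple lookup table from account address to that account's txs.
type pool struct {
	mu       sync.RWMutex           // guards the map itself, not the per-account data
	accounts map[string]*accountTxs // account address -> transactions
}
```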
By having a separate lock per account, the mempool avoids lock contention when: