
perf(l1): two-CF receipts migration #6598

Open

azteca1998 wants to merge 18 commits into main from perf/two-cf-receipts-migration


Conversation

@azteca1998
Contributor

Summary

  • Replace the in-place cursor-held migration (re-key + per-key deletes on receipts) with a two-CF approach: copy entries from receipts to receipts_v2 with fixed-width keys; the old CF is then dropped on a subsequent startup
  • No per-key delete tombstones → less compaction overhead
  • Old CF dropped atomically by RocksDB (drops SST files directly, no compaction needed)
  • One read pass + one write pass, no mixed reads/writes on same CF

Benchmark (150M synthetic entries)

| Approach | Wall-clock time | Notes |
| --- | --- | --- |
| Two-CF (this PR) | 88 s | Copy to new CF, old CF dropped on restart |
| Cursor-held (prev) | 487 s | In-place re-key + per-key deletes |
| Temp-file dump | 221 s | Dump keys to file, re-key + delete |

Files changed

  • tables.rs — Add RECEIPTS_V2 constant, keep RECEIPTS for migration reads
  • rocksdb.rs — Switch CF options to RECEIPTS_V2
  • migrations.rs — Rewrite migrate_1_to_2 to copy to new CF (no deletes)
  • store.rs — All receipt read/write ops use RECEIPTS_V2
  • Added bench_migration.rs and seed_migration_test.rs binaries for benchmarking

Test plan

  • cargo test -p ethrex-storage --features rocksdb -- migrations passes
  • cargo clippy -p ethrex-storage --features rocksdb clean
  • cargo check (full workspace) passes
  • Benchmark on production-sized data (srv1 has 16 GB receipts CF)

… iteration

Change RECEIPTS key from RLP-encoded (BlockHash, u64) to raw
block_hash (32B) || index (8B big-endian u64). This enables
cursor-based prefix iteration by block hash, replacing the previous
point-lookup loop.

- Add receipt_key() helper for the new fixed-width key format
- Rewrite get_receipts_for_block_from_index to use prefix_iterator
- Add v1→v2 migration (batch-processes old RLP keys, crash-safe)
- Bump STORE_SCHEMA_VERSION to 2
- Remove benchmark code from the previous iteration
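The fixed-width layout described above can be sketched as follows. The helper name `receipt_key` comes from the PR, but this standalone signature is an assumption:

```rust
// Sketch of the fixed-width key: block_hash (32B) || index (8B big-endian).
fn receipt_key(block_hash: &[u8; 32], index: u64) -> [u8; 40] {
    let mut key = [0u8; 40];
    key[..32].copy_from_slice(block_hash);
    // Big-endian index bytes make lexicographic key order match numeric
    // index order, which is what makes prefix iteration by block hash work.
    key[32..].copy_from_slice(&index.to_be_bytes());
    key
}

fn main() {
    let hash = [0xab_u8; 32];
    // Keys for the same block sort by index; a little-endian or
    // ASCII-style encoding would not preserve this ordering.
    assert!(receipt_key(&hash, 2) < receipt_key(&hash, 10));
}
```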

Switch get_all_block_rpc_receipts and get_all_block_receipts from
per-receipt point lookups to a single get_receipts_for_block() call,
which uses prefix_iterator for cursor-based batch retrieval.

The cursor-based batch retrieval is slower for RPC because iterators
bypass RocksDB block cache optimizations that point lookups benefit
from. Keep cursor iteration only for p2p (get_receipts_for_block),
restore per-receipt get() for the RPC handlers.

eth_getTransactionReceipt previously fetched ALL receipts for a block
(N point lookups) just to return one. Now uses cursor iteration with
a max_count limit to fetch only receipts 0..=index, stopping the
cursor early. For a block with 200 txs and a target at index 10,
this fetches 11 receipts instead of 200.

- Add max_count parameter to get_receipts_for_block_from_index
- Add target_index parameter to get_all_block_rpc_receipts
- eth_getTransactionReceipt passes Some(index) to stop early
- eth_getBlockReceipts passes None to fetch all
- get_all_block_receipts uses cursor for raw receipt retrieval
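A minimal sketch of this early-stopping cursor, using a `BTreeMap` as a stand-in for the RocksDB column family (`fetch_receipts` and the map-based setup are illustrative, not ethrex's actual API):

```rust
use std::collections::BTreeMap;

// Stand-in for the receipts_v2 CF: a sorted map keyed by
// block_hash (32B) || index (8B big-endian).
fn fetch_receipts(
    cf: &BTreeMap<Vec<u8>, Vec<u8>>,
    block_hash: &[u8; 32],
    max_count: Option<usize>,
) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    // Seek to the first key with this block-hash prefix, then walk forward.
    for (k, v) in cf.range(block_hash.to_vec()..) {
        if !k.starts_with(block_hash) {
            break; // left the block's key range
        }
        out.push(v.clone());
        if let Some(max) = max_count {
            if out.len() >= max {
                break; // stop the cursor early, e.g. for eth_getTransactionReceipt
            }
        }
    }
    out
}

fn main() {
    let hash = [1u8; 32];
    let mut cf = BTreeMap::new();
    for i in 0..200u64 {
        let mut k = hash.to_vec();
        k.extend_from_slice(&i.to_be_bytes());
        cf.insert(k, vec![i as u8]);
    }
    // Target at index 10: fetch receipts 0..=10 (11 entries), not all 200.
    assert_eq!(fetch_receipts(&cf, &hash, Some(11)).len(), 11);
}
```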

Without a RocksDB prefix extractor, prefix_iterator_cf seeks to the
correct position but doesn't stop at the prefix boundary. The loop
was iterating through the entire remaining TRANSACTION_LOCATIONS table
after finding the match, causing eth_getTransactionReceipt to take
seconds instead of milliseconds.
- Skip RECEIPTS entries with unexpected key lengths instead of
  attempting to decode them
- Add receipt count validation in get_all_block_rpc_receipts so
  missing receipts produce an error instead of a silent empty result
- Add guard in eth_getTransactionReceipt for short receipt lists
- Remove redundant receipt length check in eth_getTransactionReceipt
  (already validated upstream in get_all_block_rpc_receipts)
- Restructure migrate_1_to_2 to materialize old keys before writing,
  avoiding dependency on snapshot semantics during concurrent read/write
- Add test for migrate_1_to_2 that seeds old RLP keys, runs migration,
  and verifies new fixed-width keys round-trip correctly

The materialize-first approach loaded all 153M old-format keys + values
into a Vec (~13 GB) before re-keying. Replace with a two-phase approach:

1. Cursor scan dumps only old-format keys to a length-prefixed binary
   temp file in the DB directory, then closes the iterator immediately.
2. Keys are read back in batches of 10K; each batch does point lookups
   for values, writes new keys, and deletes old keys.

This avoids both the memory spike (only one 10K batch in memory at a
time) and the snapshot semantics concern (no concurrent read iterator
and writes on the same column family).

Crash safety is preserved: metadata stays at v1 until the migration
completes, so an interrupted migration restarts from scratch. Point
lookups for already-deleted keys return None and are skipped.

Replace the two-phase temp-file migration with a single-pass
cursor-held approach. The RocksDB iterator holds a snapshot so
it sees a consistent pre-migration view; writes are accumulated
in batches of 10K and flushed without closing the iterator.

This avoids both the temp file on disk and materializing all
keys in memory, while keeping the code simpler.

Replace the in-place cursor-held migration (re-key + per-key deletes on
`receipts`) with a two-CF approach: copy entries from `receipts` to
`receipts_v2` with fixed-width keys, then the old CF gets dropped on
subsequent startup.

Advantages over cursor-held approach:
- No per-key delete tombstones (saves 150M deletes → less compaction)
- Old CF dropped atomically by RocksDB (drops SST files directly)
- One read pass + one write pass, no mixed reads/writes on same CF

Benchmark (150M synthetic entries):
- Two-CF (this PR):     88s,  4.6 GB peak RSS
- Cursor-held (prev):  487s
- Temp-file dump:      221s

Also adds seed_migration_test and bench_migration binaries for
reproducible migration benchmarking.
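The copy-only shape of the migration can be sketched with BTreeMaps standing in for the two column families. This is illustrative only: `migrate_receipts` and the fake key decoder are invented names, and the real code batches RocksDB writes (~10K per batch) and keeps the schema metadata at v1 until completion:

```rust
use std::collections::BTreeMap;

// Copy every entry from the old CF into the new CF under a fixed-width key.
// No deletes are issued, so no tombstones; the old CF is dropped wholesale later.
fn migrate_receipts(
    old_cf: &BTreeMap<Vec<u8>, Vec<u8>>,
    decode_old_key: impl Fn(&[u8]) -> Option<([u8; 32], u64)>,
) -> BTreeMap<Vec<u8>, Vec<u8>> {
    let mut new_cf = BTreeMap::new();
    for (old_key, value) in old_cf {
        // Skip undecodable keys rather than failing the whole migration.
        let Some((hash, index)) = decode_old_key(old_key.as_slice()) else {
            continue;
        };
        let mut new_key = hash.to_vec();
        new_key.extend_from_slice(&index.to_be_bytes());
        // Idempotent put: re-running after a crash simply overwrites.
        new_cf.insert(new_key, value.clone());
    }
    new_cf
}

fn main() {
    // Fake "old" key format (one marker byte, then hash, then LE index);
    // stands in for the real RLP-encoded (BlockHash, u64) keys.
    let hash = [2u8; 32];
    let mut old = BTreeMap::new();
    let mut k = vec![0xc0u8];
    k.extend_from_slice(&hash);
    k.extend_from_slice(&7u64.to_le_bytes());
    old.insert(k, vec![42u8]);
    let new_cf = migrate_receipts(&old, |k: &[u8]| {
        let h: [u8; 32] = k.get(1..33)?.try_into().ok()?;
        let idx = u64::from_le_bytes(k.get(33..41)?.try_into().ok()?);
        Some((h, idx))
    });
    assert_eq!(new_cf.len(), 1);
}
```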

@azteca1998 azteca1998 changed the title perf(storage): two-CF receipts migration (receipts → receipts_v2) perf(l1): two-CF receipts migration May 13, 2026
@github-actions

github-actions Bot commented May 13, 2026

Lines of code report

Total lines added: 420
Total lines removed: 1
Total lines changed: 421

Detailed view
+-------------------------------------------------+-------+------+
| File                                            | Lines | Diff |
+-------------------------------------------------+-------+------+
| ethrex/crates/networking/rpc/eth/block.rs       | 371   | +6   |
+-------------------------------------------------+-------+------+
| ethrex/crates/networking/rpc/eth/transaction.rs | 576   | -1   |
+-------------------------------------------------+-------+------+
| ethrex/crates/storage/api/tables.rs             | 24    | +1   |
+-------------------------------------------------+-------+------+
| ethrex/crates/storage/bench_migration.rs        | 75    | +75  |
+-------------------------------------------------+-------+------+
| ethrex/crates/storage/migrations.rs             | 167   | +89  |
+-------------------------------------------------+-------+------+
| ethrex/crates/storage/seed_migration_test.rs    | 218   | +218 |
+-------------------------------------------------+-------+------+
| ethrex/crates/storage/store.rs                  | 2687  | +31  |
+-------------------------------------------------+-------+------+

@azteca1998 azteca1998 marked this pull request as ready for review May 13, 2026 11:50
@github-actions

🤖 Claude Code Review



PR #6598 Review: perf(l1): two-CF receipts migration

Overall this is a well-thought-out performance change. The two-CF approach is clearly faster than alternatives, the crash-safety reasoning is sound, and the get_transaction_location prefix-scan fix is a genuine correctness improvement. A few issues need addressing before merge.


Critical

RECEIPTS CF is never dropped — contradicts PR description

RECEIPTS is still present in the TABLES array after this PR (tables.rs, the pub const TABLES: [&str; 20] array). RocksDB's auto-cleanup in RocksDBBackend::open() only drops CFs that are absent from TABLES. Because RECEIPTS remains there, the old CF — potentially gigabytes — will never be dropped automatically, not even on subsequent restarts.

The PR description states "Old CF dropped atomically by RocksDB (drops SST files directly, no compaction needed)" and "old CF dropped on restart," but that cannot happen in this commit. Either:

  • RECEIPTS must be removed from TABLES in this PR (requiring a second migration step or a startup-time drop), or
  • the description must be corrected and a follow-up issue opened to track the removal.

Until that happens, every node that migrates will silently carry both CFs forever, undoing the disk-space benefit of the migration.


Medium

Linear scan on start_index in get_receipts_for_block_from_index — O(start_index) I/O

store.rs (the updated get_receipts_for_block_from_index):

if start_index > 0 {
    let idx = u64::from_be_bytes(...);
    if idx < start_index {
        continue;          // reads and discards every entry 0..start_index
    }
}

With the new fixed-width key block_hash (32B) || index (8B BE), keys are lexicographically ordered, so a single seek directly to block_hash || start_index.to_be_bytes() would skip over all prior entries in O(log N). The current approach reads and discards every entry below start_index, which is a regression for the p2p eth/70 path (EIP-7975) when start_index is large.

Suggestion: expose a seek on the iterator (or use a separate lower-bound key), and drop the continue branch entirely.
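Using a sorted-map stand-in, the suggested seek looks like this: build the composite lower bound block_hash || start_index and start the range there, so entries below start_index are never read (`receipts_from_index` is a hypothetical sketch, not the RocksDB iterator API):

```rust
use std::collections::BTreeMap;

// Seek directly to block_hash || start_index instead of scanning from index 0.
fn receipts_from_index(
    cf: &BTreeMap<Vec<u8>, Vec<u8>>,
    block_hash: &[u8; 32],
    start_index: u64,
) -> Vec<Vec<u8>> {
    // Big-endian index bytes keep key order aligned with index order, so the
    // composite lower bound lands exactly on the first wanted entry.
    let mut lower = block_hash.to_vec();
    lower.extend_from_slice(&start_index.to_be_bytes());
    cf.range(lower..)
        .take_while(|(k, _)| k.starts_with(block_hash)) // stop at prefix boundary
        .map(|(_, v)| v.clone())
        .collect()
}

fn main() {
    let hash = [3u8; 32];
    let mut cf = BTreeMap::new();
    for i in 0..100u64 {
        let mut k = hash.to_vec();
        k.extend_from_slice(&i.to_be_bytes());
        cf.insert(k, vec![i as u8]);
    }
    // Entries 0..90 are never visited by the iterator.
    assert_eq!(receipts_from_index(&cf, &hash, 90).len(), 10);
}
```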


Open RocksDB snapshot held for the entire migration — disk pressure

migrations.rs, migrate_1_to_2:

let txn = backend.begin_read()?;     // snapshot taken here
let iter = txn.prefix_iterator(RECEIPTS, &[])?;
...
// write transactions committed in the loop while txn stays open

Holding a snapshot open for ~88 s (or much longer on a production-sized 16 GB CF) prevents RocksDB from releasing old SST files via compaction. For nodes with limited disk, this can cause the available space to temporarily drop below what is needed. Consider either (a) checkpointing the progress and re-acquiring the read snapshot periodically, or (b) documenting the extra disk headroom required in the migration warning log.


Low / Informational

Changelog PR number mismatch

CHANGELOG.md attributes this change to #6548, but this PR is #6598. Should be corrected.


libc and tracing-subscriber added as main (non-dev) dependencies

storage/Cargo.toml:

[dependencies]
libc = "0.2"
tracing-subscriber = { workspace = true }

These are only used by the two benchmark binaries (bench_migration, seed_migration_test). Binary targets in [[bin]] sections cannot use [dev-dependencies], so the placement is technically required. However, tracing-subscriber is a non-trivial transitive dependency pulled into every consumer of the storage crate.

Consider gating these binaries (and their deps) behind a dedicated Cargo feature (e.g., migration-tools) so that library users and production builds don't pay the cost.


Redundant .take(fetch_count) in get_all_block_rpc_receipts

rpc/eth/block.rs:

if all_receipts.len() != fetch_count {
    return Err(...);
}
...
.zip(all_receipts.iter())
.enumerate()
.take(fetch_count)   // redundant: zip already stops at min(txs.len(), receipts.len())
                     // and receipts.len() == fetch_count has just been verified

The .take(fetch_count) is a no-op after the length assertion passes and can be removed.


Positive notes

  • The two-CF migration strategy (88 s vs 487 s) is well-justified with benchmarks, and crash-safety via metadata versioning is handled correctly.
  • spawn_blocking for the cursor-based receipt retrieval is the right approach to avoid blocking the async runtime.
  • The get_transaction_location prefix-scan fix (break instead of continue) is a genuine correctness improvement for databases without a configured prefix extractor.
  • receipts.last() in transaction.rs is cleaner than the previous index-based .get(index as usize) for the single-receipt lookup.
  • The migration test (migrate_1_to_2_converts_rlp_keys_to_fixed_width) is thorough and validates both the new keys and the preservation of old keys.

Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR replaces the in-place cursor-held receipts migration with a two-CF strategy: old RLP-keyed entries from receipts are copied to a new receipts_v2 CF with fixed-width block_hash (32B) || index (8B BE) keys, eliminating per-key delete tombstones and reducing compaction overhead significantly (88s vs 487s on 150M entries).

  • migrations.rs: adds migrate_1_to_2 that batch-copies entries in groups of 10 000, crash-safe via idempotent puts; schema version bumped to 2 in lib.rs.
  • store.rs / rocksdb.rs: all receipt read/write paths switched to RECEIPTS_V2; get_receipts_for_block_from_index moved to spawn_blocking with prefix-iterator-based batch retrieval and an optional max_count cap.
  • block.rs / transaction.rs: get_all_block_rpc_receipts refactored to accept target_index so eth_getTransactionReceipt fetches only the receipts it needs rather than the entire block's set.

Confidence Score: 3/5

The migration logic is crash-safe and the RPC changes are correct, but the old receipts CF will persist indefinitely on disk because RECEIPTS was not removed from TABLES.

The migration logic and RPC refactoring are sound, but the old receipts CF is never actually dropped because RECEIPTS was left in TABLES, meaning up to 16 GB of data on production will not be reclaimed — directly contradicting the PR's stated goal and both in-code comments that describe the drop behaviour.

crates/storage/api/tables.rs: RECEIPTS must be removed from the TABLES array for the auto-cleanup drop to work as documented.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/storage/api/tables.rs | Adds RECEIPTS_V2 constant and doc comment; RECEIPTS incorrectly left in TABLES, preventing the old CF from ever being dropped by the cleanup logic. |
| crates/storage/migrations.rs | Adds migrate_1_to_2 that copies old RLP-keyed receipts to receipts_v2 in batches; crash-safe and idempotent, but doc-comment claims RECEIPTS is removed from TABLES when it is not. |
| crates/storage/backend/rocksdb.rs | Switches special CF options and compressible table list from RECEIPTS to RECEIPTS_V2; cleanup drop loop is correct but never fires for RECEIPTS because it is still in TABLES. |
| crates/storage/store.rs | All receipt read/write paths switched to RECEIPTS_V2 with new fixed-width key; get_receipts_for_block_from_index correctly moved to spawn_blocking but uses a linear prefix scan instead of a seek for start_index > 0. |
| crates/networking/rpc/eth/block.rs | Refactors get_all_block_rpc_receipts to accept an optional target_index and batch-fetch receipts via prefix iteration; receipt-count validation added; changes look correct. |
| crates/networking/rpc/eth/transaction.rs | Uses receipts.last() instead of receipts.get(index as usize); functionally equivalent given the count check but fragile. |
| crates/storage/bench_migration.rs | New benchmark binary for manual migration timing; not part of the production path. |
| crates/storage/seed_migration_test.rs | New seeding binary that writes 150M synthetic old-format RLP receipt entries for benchmarking; not part of normal operation. |

Sequence Diagram

sequenceDiagram
    participant Node as ethrex Node
    participant RDB as RocksDBBackend::open()
    participant Mig as migrations::run_pending_migrations()
    participant OLD as receipts CF (v1)
    participant NEW as receipts_v2 CF (v2)

    Node->>RDB: open(path)
    RDB->>RDB: union(existing_cfs, TABLES) to all_cfs_to_open
    Note over RDB: Both receipts and receipts_v2 opened
    RDB->>RDB: cleanup drop CFs not in TABLES
    Note over RDB: receipts IS in TABLES so NOT dropped

    Node->>Mig: run_pending_migrations(backend, path, v1)
    loop batch of 10 000
        Mig->>OLD: prefix_iterator read all entries
        OLD-->>Mig: (rlp_key, value)
        Mig->>Mig: decode H256 u64 build fixed-width key
        Mig->>NEW: put_batch(new_key, value)
        Mig->>Mig: write_metadata(version + 1)
    end
    Mig-->>Node: "Ok schema_version=2"

    Note over OLD: Still alive never dropped

    Node->>Node: Normal operation reads/writes to receipts_v2 only

Comments Outside Diff (1)

  1. crates/storage/api/tables.rs, line 110-131 (link)

    P1 RECEIPTS left in TABLES — old CF is never dropped

    RECEIPTS is still in the TABLES array at line 120. The cleanup loop in RocksDBBackend::open() only drops column families whose names are not in TABLES:

    if cf_name != "default" && !TABLES.contains(&cf_name.as_str()) { drop(cf) }
    

    Because "receipts" remains in TABLES, this condition is always false and the old CF is silently kept alive across every restart — the 16 GB of old-format data on production will never be reclaimed. The doc-comment on RECEIPTS ("dropped automatically on next startup") and the migration function comment ("since RECEIPTS is no longer listed in TABLES") both contradict the actual code.

    The fix is to remove RECEIPTS from TABLES. The migration can still access the CF during migration because RocksDBBackend::open() unions existing_cfs with TABLES before opening, so a pre-migration database's "receipts" CF will still be opened from existing_cfs. On the startup after the successful migration (schema v2), "receipts" will be in existing_cfs but no longer in TABLES, and the cleanup loop will drop it.




Comment thread crates/networking/rpc/eth/transaction.rs

Comment on line 310

serde_json::to_value(receipts.last()).map_err(|error| RpcErr::Internal(error.to_string()))

P2 receipts.last() is correct only as long as fetch_count == index + 1 and the count check above enforces that exact length. Using receipts.get(index as usize) is an explicit, index-stable access that doesn't silently return the wrong receipt if the count logic is ever adjusted.

Suggested change
serde_json::to_value(receipts.last()).map_err(|error| RpcErr::Internal(error.to_string()))
serde_json::to_value(receipts.get(index as usize)).map_err(|error| RpcErr::Internal(error.to_string()))

Comment thread crates/storage/store.rs
Comment on lines +1098 to 1136
- let txn = self.backend.begin_read()?;
- loop {
-     let key = (*block_hash, index).encode_to_vec();
-     match txn.get(RECEIPTS, key.as_slice())? {
-         Some(receipt_bytes) => {
-             let receipt = Receipt::decode(receipt_bytes.as_slice())?;
-             receipts.push(receipt);
-             index += 1;
-         }
-         None => break,
-     }
- }
-
- Ok(receipts)
+ let block_hash = *block_hash;
+
+ tokio::task::spawn_blocking(move || {
+     let txn = backend.begin_read()?;
+     let prefix = block_hash.as_bytes().to_vec();
+     let iter = txn.prefix_iterator(RECEIPTS_V2, &prefix)?;
+     let mut receipts = Vec::new();
+     for result in iter {
+         let (k, v) = result?;
+         if !k.starts_with(&prefix) {
+             break;
+         }
+         if k.len() != 40 {
+             continue;
+         }
+         // Skip entries before start_index (for eth/70 partial requests)
+         if start_index > 0 {
+             let idx_bytes: [u8; 8] = k[32..40]
+                 .try_into()
+                 .expect("slice is exactly 8 bytes (checked k.len() == 40)");
+             let idx = u64::from_be_bytes(idx_bytes);
+             if idx < start_index {
+                 continue;
+             }
+         }
+         receipts.push(Receipt::decode(v.as_ref())?);
+         if let Some(max) = max_count
+             && receipts.len() >= max
+         {
+             break;
+         }
+     }
+
+     Ok(receipts)
+ })
+ .await
+ .map_err(|e| StoreError::Custom(format!("Task panicked: {e}")))?
}

// Snap State methods

P2 EIP-7975 start_index performs a full linear scan from index 0

For the EIP-7975 partial-receipt path, start_index can be a large offset into the block. The current implementation issues a prefix_iterator from the beginning of the block-hash prefix and then iterates + skips every entry with idx < start_index. Because keys are stored as big-endian u64, they are lexicographically sorted, so a seek directly to the composite key block_hash || start_index.to_be_bytes() would jump straight to the target. With the current linear scan, a request for receipts starting at index N always reads the first N entries before returning any results.


@github-actions

github-actions Bot commented May 13, 2026

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_BubbleSort | 2.997 ± 0.034 | 2.948 | 3.057 | 1.10 ± 0.02 |
| main_levm_BubbleSort | 2.716 ± 0.030 | 2.687 | 2.787 | 1.00 |
| pr_revm_BubbleSort | 3.031 ± 0.031 | 2.991 | 3.103 | 1.12 ± 0.02 |
| pr_levm_BubbleSort | 2.723 ± 0.018 | 2.701 | 2.748 | 1.00 ± 0.01 |

Benchmark Results: ERC20Approval

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_ERC20Approval | 991.0 ± 9.5 | 978.9 | 1012.5 | 1.00 ± 0.01 |
| main_levm_ERC20Approval | 1037.0 ± 7.7 | 1029.0 | 1051.0 | 1.05 ± 0.01 |
| pr_revm_ERC20Approval | 987.6 ± 6.5 | 983.6 | 1005.3 | 1.00 |
| pr_levm_ERC20Approval | 1031.6 ± 11.5 | 1016.0 | 1057.0 | 1.04 ± 0.01 |

Benchmark Results: ERC20Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_ERC20Mint | 133.5 ± 0.9 | 132.3 | 135.5 | 1.00 |
| main_levm_ERC20Mint | 157.7 ± 0.5 | 157.0 | 158.5 | 1.18 ± 0.01 |
| pr_revm_ERC20Mint | 134.1 ± 0.6 | 133.5 | 135.3 | 1.00 ± 0.01 |
| pr_levm_ERC20Mint | 156.9 ± 3.0 | 155.2 | 165.4 | 1.17 ± 0.02 |

Benchmark Results: ERC20Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_ERC20Transfer | 232.8 ± 1.5 | 231.6 | 236.1 | 1.00 |
| main_levm_ERC20Transfer | 259.8 ± 2.3 | 257.5 | 264.4 | 1.12 ± 0.01 |
| pr_revm_ERC20Transfer | 235.1 ± 1.8 | 232.2 | 237.8 | 1.01 ± 0.01 |
| pr_levm_ERC20Transfer | 258.8 ± 4.3 | 254.8 | 270.4 | 1.11 ± 0.02 |

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_Factorial | 226.4 ± 1.4 | 225.4 | 230.2 | 1.00 |
| main_levm_Factorial | 245.7 ± 5.0 | 242.2 | 259.6 | 1.08 ± 0.02 |
| pr_revm_Factorial | 229.2 ± 0.7 | 227.6 | 229.9 | 1.01 ± 0.01 |
| pr_levm_Factorial | 247.2 ± 11.8 | 241.5 | 280.6 | 1.09 ± 0.05 |

Benchmark Results: FactorialRecursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_FactorialRecursive | 1.635 ± 0.037 | 1.576 | 1.684 | 1.03 ± 0.02 |
| main_levm_FactorialRecursive | 1.580 ± 0.009 | 1.567 | 1.591 | 1.00 |
| pr_revm_FactorialRecursive | 1.656 ± 0.039 | 1.607 | 1.697 | 1.05 ± 0.03 |
| pr_levm_FactorialRecursive | 1.585 ± 0.009 | 1.574 | 1.602 | 1.00 ± 0.01 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_Fibonacci | 206.9 ± 0.7 | 205.8 | 208.5 | 1.01 ± 0.00 |
| main_levm_Fibonacci | 229.3 ± 17.8 | 216.0 | 277.0 | 1.12 ± 0.09 |
| pr_revm_Fibonacci | 205.1 ± 0.7 | 203.4 | 205.7 | 1.00 |
| pr_levm_Fibonacci | 219.0 ± 4.4 | 216.0 | 228.5 | 1.07 ± 0.02 |

Benchmark Results: FibonacciRecursive

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_FibonacciRecursive | 864.8 ± 12.7 | 848.0 | 887.1 | 1.25 ± 0.03 |
| main_levm_FibonacciRecursive | 690.4 ± 9.8 | 679.8 | 705.2 | 1.00 |
| pr_revm_FibonacciRecursive | 860.3 ± 6.6 | 849.5 | 870.0 | 1.25 ± 0.02 |
| pr_levm_FibonacciRecursive | 691.8 ± 14.5 | 678.1 | 720.4 | 1.00 ± 0.03 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_ManyHashes | 8.8 ± 0.8 | 8.4 | 10.9 | 1.04 ± 0.09 |
| main_levm_ManyHashes | 9.7 ± 0.1 | 9.6 | 9.9 | 1.16 ± 0.01 |
| pr_revm_ManyHashes | 8.4 ± 0.1 | 8.4 | 8.5 | 1.00 |
| pr_levm_ManyHashes | 9.7 ± 0.1 | 9.6 | 10.0 | 1.15 ± 0.01 |

Benchmark Results: MstoreBench

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_MstoreBench | 259.5 ± 5.0 | 257.0 | 273.4 | 1.14 ± 0.02 |
| main_levm_MstoreBench | 229.6 ± 2.1 | 226.8 | 233.7 | 1.00 ± 0.01 |
| pr_revm_MstoreBench | 263.5 ± 6.0 | 258.4 | 274.2 | 1.15 ± 0.03 |
| pr_levm_MstoreBench | 228.5 ± 1.1 | 227.6 | 231.1 | 1.00 |

Benchmark Results: Push

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_Push | 296.2 ± 17.3 | 288.7 | 344.9 | 1.08 ± 0.06 |
| main_levm_Push | 273.7 ± 2.4 | 271.6 | 278.6 | 1.00 |
| pr_revm_Push | 290.9 ± 1.4 | 289.4 | 293.8 | 1.06 ± 0.01 |
| pr_levm_Push | 274.5 ± 1.9 | 271.8 | 277.4 | 1.00 ± 0.01 |

Benchmark Results: SstoreBench_no_opt

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| main_revm_SstoreBench_no_opt | 167.9 ± 5.3 | 163.7 | 182.3 | 1.66 ± 0.08 |
| main_levm_SstoreBench_no_opt | 101.3 ± 3.3 | 99.9 | 110.6 | 1.00 |
| pr_revm_SstoreBench_no_opt | 170.3 ± 11.3 | 163.1 | 201.6 | 1.68 ± 0.12 |
| pr_levm_SstoreBench_no_opt | 101.7 ± 1.6 | 99.8 | 104.5 | 1.00 ± 0.04 |

Comment thread crates/storage/api/tables.rs

PENDING_BLOCKS,
TRANSACTION_LOCATIONS,
RECEIPTS,
RECEIPTS_V2,
Contributor


The old RECEIPTS CF won't auto-drop. The migration doc (migrations.rs:104-106) and the PR description both say:

"the old receipts CF is not deleted here — it will be dropped automatically by the auto-cleanup in RocksDBBackend::open() on the next startup (since RECEIPTS is no longer listed in TABLES)."

But RECEIPTS IS still in TABLES (line 120 above this addition). The auto-cleanup at backend/rocksdb.rs only drops CFs whose name is NOT in TABLES:

for cf_name in &existing_cfs {
    if cf_name != "default" && !TABLES.contains(&cf_name.as_str()) {
        warn!("Dropping obsolete column family: {}", cf_name);
        let _ = db.drop_cf(cf_name) ...
    }
}

So with both RECEIPTS and RECEIPTS_V2 in TABLES, the old CF survives forever. After migration both CFs hold the same data — on srv1's 16 GB receipts CF, that's 16 GB of duplicate state that never goes away.

The test at migrations.rs:250-254 confirms the intent ("Old keys should still be in RECEIPTS (dropped at startup)") but doesn't actually verify the drop happens.

Fix: remove RECEIPTS from the TABLES array (line 120). The migration code still references it via tables::RECEIPTS for read-only access, which is fine — prefix_iterator against a CF name that's been opened (via the auto-create path) works regardless of whether the constant is in TABLES. The auto-cleanup loop will then see "receipts" is unlisted and call drop_cf on next startup.

Worth a quick test of this scenario before merge — e.g., a unit test that runs migrate + reopens the backend + asserts receipts CF is gone.
