perf(l1): BAL optimistic merkleization on validation path #6655
Conversation
Decouple merkleization from EVM execution when the validation path receives a BAL: synthesize per-field deltas from the input BlockAccessList pre-execution and run merkle stages B/C/D in parallel with execution + warming. validate_block_access_list_hash remains the post-execution correctness gate. Closes #6584.
🤖 Codex Code Review: Findings
No other consensus/gas-accounting problems stood out in the touched logic. I did not run the test suite; this is a static review of the diff.
Automated review by OpenAI Codex · gpt-5.4 · custom prompt
🤖 Claude Code Review
Greptile Summary: This PR decouples BAL-merkleization from EVM execution on the `engine_newPayload` validation path.
Confidence Score: 4/5

Safe to merge for the validation path; correctness is gated by the existing post-execution BAL hash check and the state-root comparison against the block header, so an incorrect optimistic trie results in block rejection rather than silent state corruption. The redesign is mechanically sound: per-field optionals avoid zero-fabrication for unchanged account fields, EIP-161 normalization in Stage C correctly handles account deletion without needing an explicit `removed` flag, and `generate_witness_from_account_updates` only reads `address` plus `added_storage.keys()`, so the stripped `optimistic_witness` is complete for its purpose.

Minor structural issues remain: `BalSynthesisItem.address` is stored but never read after construction, `BalStateWorkItem.removed` is an always-false dead branch, and the parallel pre-warm opens one `state_trie` handle per account rather than per thread. The pre-warm block in `handle_merkleization_bal_from_updates` (blockchain.rs) and the `BalSynthesisItem` struct definition (block_access_list.rs) deserve a second look.
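The per-field-optionals point above can be illustrated with a minimal sketch (stand-in types, not the real `BalSynthesisItem` / `AccountState`): applying a balance-only delta leaves the nonce and code hash untouched instead of fabricating zeros.

```rust
// Simplified stand-in types; the real structs use U256/H256 and carry more fields.
#[derive(Clone, Debug, PartialEq)]
struct AccountState {
    nonce: u64,
    balance: u128,
    code_hash: [u8; 32],
}

#[derive(Default)]
struct SynthesisItem {
    nonce: Option<u64>,
    balance: Option<u128>,
    code_hash: Option<[u8; 32]>,
}

fn apply_item(state: &mut AccountState, item: &SynthesisItem) {
    // Only overwrite fields the BAL actually recorded a change for; a
    // balance-only delta must not zero an existing account's nonce or code hash.
    if let Some(n) = item.nonce {
        state.nonce = n;
    }
    if let Some(b) = item.balance {
        state.balance = b;
    }
    if let Some(c) = item.code_hash {
        state.code_hash = c;
    }
}
```

With an `Option<AccountInfo>` blob instead, a balance-only recipient would have to carry (and write back) some value for every other field, which is exactly the corruption the review calls out as avoided.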
| Filename | Overview |
|---|---|
| crates/blockchain/blockchain.rs | Core change: pre-scope BAL synthesis, optional channel, handle_merkleization_bal_from_updates with parallel pre-warm; BalStateWorkItem.removed is now always false (dead branch) and timing array expanded from 6 to 7 instants in the pipeline result. |
| crates/common/types/block_access_list.rs | New BalSynthesisItem struct and synthesize_bal_updates function with 13 unit tests; BalSynthesisItem.address field is redundant (duplicates the map key). |
| crates/vm/backends/levm/mod.rs | merkleizer: Sender to Option<Sender>; BAL path skips bal_to_account_updates + send; sequential path uses expect() to enforce the invariant that non-BAL callers always provide a Sender. |
| crates/vm/backends/mod.rs | One-line signature change: merkleizer: Sender to Option<Sender> to match LEVM. |
| crates/common/types/mod.rs | Re-exports BalSynthesisItem and synthesize_bal_updates from block_access_list. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Main as Main Thread
    participant Exec as EVM Exec Thread
    participant Merkle as Merkleizer Thread
    participant Warm as Warmer Thread
    Note over Main: exec_merkle_start captured
    Main->>Main: synthesize_bal_updates(bal)
    Main->>Main: build optimistic_witness
    Note over Main: thread::scope entered
    par Parallel threads
        Main->>Exec: spawn merkleizer=None on BAL path
        Main->>Merkle: spawn prepared=optimistic_updates
        Main->>Warm: spawn
    end
    Note over Merkle: merkle_start_instant captured
    Merkle->>Merkle: par_iter pre-warm open_state_trie per account
    Merkle->>Merkle: Stage B parallel storage root computation
    Merkle->>Merkle: Stage C per-shard state trie updates (EIP-161)
    Merkle->>Merkle: Stage D finalize root state_trie_hash
    Note over Merkle: merkle_end_instant captured
    Exec->>Exec: execute_block_parallel (skip bal_to_account_updates)
    Exec->>Exec: validate_block_access_list_hash
    Note over Exec: exec_end_instant captured
    Main->>Main: accumulated_updates optimistic_witness OR streaming_witness
    Main->>Main: verify state_trie_hash matches block header
    Note over Main: exec_merkle_end_instant captured
```
---
### Issue 1 of 3
crates/common/types/block_access_list.rs:1600-1608
The `address` field duplicates the map key in `FxHashMap<Address, BalSynthesisItem>`. In `handle_merkleization_bal_from_updates` the key `addr` is always used instead of `item.address`, and in `optimistic_witness` `*addr` is used too — so this field is written but never read. Removing it saves 20 bytes per account in the synthesized map.
```suggestion
#[derive(Debug, Clone, Default)]
pub struct BalSynthesisItem {
pub balance: Option<U256>,
pub nonce: Option<u64>,
pub code_hash: Option<H256>,
pub code: Option<Code>,
pub added_storage: FxHashMap<H256, U256>,
}
```
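To see why the field is redundant, a minimal sketch (hypothetical simplified types, not the real `BalSynthesisItem`): iteration over the map already yields each address as the key, so nothing needs to read it from the value.

```rust
use std::collections::HashMap;

// Hypothetical simplified item: the address lives only in the map key.
#[derive(Default)]
struct Item {
    balance: Option<u128>,
}

// Iteration yields the key, so a per-item `address` field would only
// duplicate it (at 20 bytes per entry).
fn touched_addresses(map: &HashMap<[u8; 20], Item>) -> Vec<[u8; 20]> {
    let mut addrs: Vec<[u8; 20]> = map.keys().copied().collect();
    addrs.sort(); // deterministic order for callers
    addrs
}
```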
### Issue 2 of 3
crates/blockchain/blockchain.rs:319-326
On the BAL synthesis path `removed` is hardcoded to `false` and `handle_merkleization_bal_from_updates` never sets it. The `if item.removed { account_state = AccountState::default(); }` branch in the Stage C worker is therefore unreachable on this path. Dropping the field from the struct would make this invariant visible and avoid confusion for future readers.
```suggestion
struct BalStateWorkItem {
hashed_address: H256,
nonce: Option<u64>,
balance: Option<U256>,
code_hash: Option<H256>,
/// Pre-computed storage root from Stage B, or None to keep existing.
storage_root: Option<H256>,
```
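For context, account deletion on this path is handled by EIP-161 empty-account normalization in Stage C rather than by an explicit `removed` flag. A minimal sketch of that check, with simplified stand-in types and a zeroed stand-in for the empty-code hash constant (the real code compares against the actual keccak of empty code):

```rust
// Stand-in for keccak256 of empty code; not the real constant value.
const EMPTY_CODE_HASH: [u8; 32] = [0u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct AccountState {
    nonce: u64,
    balance: u128,
    code_hash: [u8; 32],
}

// EIP-161: an account with zero nonce, zero balance, and no code is "empty"
// and must not exist in the state trie; None here means "absent/deleted".
fn normalize(state: AccountState) -> Option<AccountState> {
    let empty =
        state.nonce == 0 && state.balance == 0 && state.code_hash == EMPTY_CODE_HASH;
    if empty { None } else { Some(state) }
}
```

Because normalization collapses empty accounts regardless of how they became empty, the always-false `removed` flag adds nothing on this path.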
### Issue 3 of 3
crates/blockchain/blockchain.rs:921-932
**Pre-warm opens one `state_trie` handle per account**
The rayon closure calls `open_state_trie(parent_state_root)` for each of the N accounts, creating N independent trie handles before Stage B opens another 16 (one per worker). If `open_state_trie` acquires any internal lock, allocates a per-handle arena, or does RocksDB snapshot work, N parallel opens could serialise or spike allocation far beyond what Stage B already requires. Worth verifying that `open_state_trie` is a lightweight read-view wrapper, or considering a single trie opened per rayon thread rather than per item.
For reference, the pre-warm block Issue 3 refers to (`crates/blockchain/blockchain.rs:921-932`):

```rust
// Warm parent state-trie pages for all touched accounts in parallel before
// Stage B / Stage C race for them. This replaces the prefetch that the old
// streaming path got for free via `bal_to_account_updates`.
accounts
    .par_iter()
    .try_for_each(|(hashed_address, _)| -> Result<(), StoreError> {
        let state_trie = self.storage.open_state_trie(parent_state_root)?;
        let _ = state_trie.get(hashed_address.as_bytes())?;
        Ok(())
    })?;

// === Stage B: Parallel per-account storage root computation ===
```
Summary

Decouple BAL-merkleization from EVM execution on the `engine_newPayload` validation path. Synthesize per-field state deltas from the input `BlockAccessList` before `thread::scope` and run merkle stages B/C/D in parallel with execution + warming. `validate_block_access_list_hash` post-execution remains the correctness gate; on mismatch the optimistic merkle output is discarded by the existing `?` error flow.

Closes #6584. Validation path only (builder still streams). Amsterdam+ only.
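The gating described above can be sketched as follows (hypothetical helper and simplified types; the real flow lives in the pipeline's error handling): the optimistically computed root is only accepted after both post-execution checks pass, so a wrong optimistic trie rejects the block rather than corrupting state.

```rust
// Sketch: both checks must pass before the optimistic root is trusted.
fn accept_block(
    optimistic_root: [u8; 32],
    header_state_root: [u8; 32],
    bal_hash_matches: bool, // outcome of validate_block_access_list_hash
) -> Result<[u8; 32], String> {
    if !bal_hash_matches {
        // Mirrors the existing `?` error flow: the merkle output is discarded.
        return Err("BAL hash mismatch".to_string());
    }
    if optimistic_root != header_state_root {
        return Err("state root mismatch".to_string());
    }
    Ok(optimistic_root)
}
```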
Design
- `BalSynthesisItem { address, balance, nonce, code_hash, code, added_storage }` in `crates/common/types/block_access_list.rs` carries per-field optionals — no `Option<AccountInfo>` blob, so a balance-only ETH-transfer recipient doesn't fabricate zero-nonce / `EMPTY_KECCACK_HASH` deltas that would corrupt the trie.
- `BalStateWorkItem` refactored to per-field optionals so the streaming flow and the synthesis flow lower into the same shape.
- `handle_merkleization_bal` → `handle_merkleization_bal_from_updates(prepared: FxHashMap<Address, BalSynthesisItem>, parent_header)`. Stage A (channel drain) deleted from the BAL path.
- `execute_block_pipeline` synthesizes `optimistic_updates` + `optimistic_witness` before `thread::scope`. No channel + no drain thread on the BAL path.
- `LEVM::execute_block_pipeline` / `execute_block_parallel` and `Evm::execute_block_pipeline` now take `Option<Sender>`. The validation-with-BAL caller passes `None` and the EVM skips its own `bal_to_account_updates` + `merkleizer.send` (`levm/mod.rs:1034-1038`) — that was redundant work (and did pre-state lookups we no longer need).
- `generate_witness_from_account_updates` only needs `address` + `added_storage.keys()`; the bulk of the witness comes from `DatabaseLogger.state_accessed`.
- `merkle_start_instant` captured so the pipeline metric line shows `start_delay` (how much of execution the merkleizer overlapped with).

Perf refinements (inside `handle_merkleization_bal_from_updates`)

- `accounts.par_iter()` pre-warm before Stage B — warms RocksDB OS pages + trie node arena for the addresses Stage B/C are about to touch. Replaces what the old streaming `bal_to_account_updates` was doing via `store.prefetch_accounts`.
- Sort `item.added_storage` before the per-slot insert loop so trie inserts walk node-arena paths in order instead of bucket order.

Correctness
- `removed` / `removed_storage` intentionally not inferred from BAL. Stage B's `value.is_zero()` + Stage C's EIP-161 normalization collapse accounts identically to the streaming path. Verified for plain selfdestruct (EIP-6780 only emits `record_balance_change(to, 0)` on same-tx-created accounts; pre-existing contracts degrade to a balance transfer).
- `storage_reads` are omitted from the synthesized map (no state delta to apply; the witness path captures their reads from `DatabaseLogger.state_accessed`).

Tests
- `synthesize_bal_updates` in `block_access_list.rs`: read-only skip, pure storage write, balance-only / nonce-only / code-only (regression cases for the partial-info bug avoided by the per-field design), last-wins on each delta vec, zero storage retained, defensive empty `slot_changes`, account creation, EIP-6780 selfdestruct collapse.
- `tooling/ef_tests/blockchain/test_runner.rs:192` exercises this code path on every Amsterdam fixture (pass 2 calls `add_block_pipeline(block, Some(bal))`).

Follow-ups (not in this PR)
- Parallelize `synthesize_bal_updates` itself via `bal.accounts().par_iter()`. The `keccak(new_code)` of each code-changed account is the serial-on-main-thread hotspot.

Test plan
- `cargo test -p ethrex-common synthesize_tests` (13/13)
- `cargo fmt --check`
- `cargo clippy -p ethrex-blockchain --no-deps -- -D warnings` (clean; the pre-existing `MAX_BLOB_TX_SIZE` warning is feature-gated by `c-kzg` on main)
- `bal` group (`tooling/ef_tests/blockchain`) — exercises the new path via the two-pass parallel runner on every Amsterdam fixture