feat(l1): ethrex-evm CLI for goevmlab differential fuzzing #6628
Draft
edg-l wants to merge 24 commits into
Add per-opcode StructLog tracing wired through `debug_traceTransaction` and `debug_traceBlockByNumber`. Output is byte-compatible with geth's `structLogLegacy` JSON shape so goevmlab and EELS reference traces diff cleanly.
- `crates/common/tracing.rs`: `StructLog`, `MemoryChunk`, `StructLogResult` with manual Serialize impls matching geth's `toLegacyJSON` field-by-field (decimal pc/gas/gasCost, opcode-name op string, geth uint256 hex stack, chunked memory, accumulated per-contract storage at SLOAD/SSTORE, omitempty refund/error/returnData).
- `crates/vm/levm/src/struct_log_tracer.rs`: `LevmStructLogTracer` with zero-cost `active: bool` gate. `pre_step_capture` + `finalize_step` keep the dispatch-loop hook to one branch when disabled.
- `crates/vm/levm/src/vm.rs`: dispatch-loop hook (pc captured pre-advance, stack reversed top→bottom for wire format), helper methods, end-of-tx capture using post-refund `gas_spent` (matches geth's `receipt.GasUsed`).
- `crates/vm/levm/src/opcode_handlers/system.rs`: explicit `gasCost` for CALL/CALLCODE/DELEGATECALL/STATICCALL/CREATE/CREATE2 = `intrinsic + callGasTemp`, matching geth.
- `crates/vm/backends/levm/tracing.rs`, `crates/vm/tracing.rs`, `crates/blockchain/tracing.rs`: `trace_tx_struct_log` plumbing.
- `crates/networking/rpc/tracing.rs`: `TracerType::StructLogger` variant (alias `structLog`), `StructLogTracerConfig` with five geth-aligned flags. Default tracer remains `CallTracer` for Blockscout compat; documented divergence from geth.
- Tests: 21 unit tests pinning every wire-format rule against geth source; 2 LEVM unit tests for dispatch+SSTORE; 3 fixture-diff integration tests (`eip3155_sstore_basic.json`, `eip3155_mstore_memory.json`, `eip3155_identity_return_data.json`).
- `tooling/scripts/gen_structlog_fixtures.sh`: pinned-geth regen procedure.
- `crates/vm/levm/benches/struct_log_disabled.rs`: Criterion microbench for the disabled hot path (~7.7 µs per 2k-opcode loop on dev machine).
Place the three structLog fixture JSONs alongside their integration tests in the project's standard test crate (test/tests/levm/fixtures/) instead of cmd/ethrex/tests/fixtures/. Updates the loader path in struct_log_tracer_tests.rs and the regen-script doc comments.
Wire-format rules and per-opcode capture semantics are already pinned by the 26 unit tests in ethrex-common and the 2 LEVM unit tests. The three JSON fixtures + gen helper + geth-targeted shell script added ~600 LoC of snapshot machinery for end-to-end coverage that one focused smoke test can provide.
The smoke test exercises the full RPC pipeline (`LEVM::trace_tx_struct_log` -> `serde_json::to_value`) on a `PUSH1 PUSH1 SSTORE STOP` program and asserts the resulting JSON has the EIP-3155 strict wrapper (`pass`/`gasUsed`/`output`/`structLogs`) and per-step shape (numeric `op`, `opName`, hex `gas`/`gasCost`/`refund`, always-present `returnData`/`memSize`, single-entry `storage` on SSTORE).
Removes:
- test/tests/levm/fixtures/*.json (3 files)
- test/tests/levm/struct_log_fixture_gen.rs
- tooling/scripts/gen_structlog_fixtures.sh (geth comparison script, obsolete after the move to EIP-3155 strict)
"structLog" inherits geth's Go-type jargon and now misleads consumers: since this PR moved to strict EIP-3155 output, clients sending `tracer: "structLog"` expecting geth's structLogLegacy shape get a different format. The new name is self-describing and sits naturally beside `callTracer` and `prestateTracer`. No aliases — clients pass `"opcodeTracer"` explicitly.
Match the prestateTracer/callTracer convention: all tracer tests live under `test/tests/levm/`, none in `crates/`.
- Deletes the 26 unit tests in `crates/common/tracing.rs` (Serialize field-by-field assertions). The single-field tests were dev scaffolding; end-to-end coverage now lives in `opcode_tracer_tests.rs`.
- Deletes the 2 LEVM unit tests + in-tracer TestDb harness in `crates/vm/levm/src/opcode_tracer.rs`. Same coverage rebuilt as bytecode-driven e2e tests via the real `LEVM::trace_tx_opcodes` entry point and the shared `TestDatabase` fixture.
- Renames `struct_log_tracer_tests.rs` -> `opcode_tracer_tests.rs` to match the existing `prestate_tracer_tests.rs` naming.
- New e2e set covers: basic execution + wrapper shape, single-entry storage on SSTORE, memory capture under `enableMemory`, return-data capture under `enableReturnData`, stack=null under `disableStack`.
The microbench (PUSH1/POP loop, no `main` baseline, no enabled-path variant) only confirmed the disabled-path branch is hot-path-clean. It served as dev scaffolding; not worth carrying long-term.
Renaming `StructLogger` to `OpcodeTracer` made all three variants share the `Tracer` suffix, which clippy flags via `enum_variant_names`. The suffix is required because `rename_all = "camelCase"` derives the externally-fixed wire names `callTracer` / `prestateTracer` / `opcodeTracer` from the variant identifiers.
- Add CLZ (0x1E) and SLOTNUM (0x4B) names to opcode_name.
- Drop redundant OpcodeTracerRpcConfig; deserialize OpcodeTracerConfig directly.
- Replace total_size counter with last_step_captured flag so finalize_step doesn't clobber the last retained entry once the limit cap is hit.
- Use call frame `to` (storage context) instead of `code_address` for SLOAD/SSTORE capture so DELEGATECALL/CALLCODE record under the caller's account.
- Replace unsafe transmute-based U256->H256 with BigEndianHash::from_uint.
- Narrow SLOAD error fallback: only AccountNotFound returns zero; other DB errors omit the storage entry instead of fabricating a value.
- omit `opName` for unknown opcodes (geth-compatible, EIP-3155 allows)
- SLOAD: omit storage entry on any read failure, not just non-AccountNotFound
- drop unused `addr` from `read_storage_for_trace` tuple
Align debug_traceTransaction output with the cross-client structLogger shape:
{failed, gas, returnValue, structLogs}; per-step gas/gasCost/refund as numbers,
op as the mnemonic string (opName dropped). EIP-3155 step content preserved
(memSize, returnData, refund always emitted).
Fix: jump() fused JUMP/JUMPI with the destination JUMPDEST, dropping the
JUMPDEST step from the trace and inflating the parent's gasCost by 1. Now
synthesizes a JUMPDEST entry when the tracer is active; the disabled hot path
keeps the fusion. last_step_captured replaced with last_step_index so the
synthetic entry doesn't shadow the parent's finalize_step patch target.
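The effect of the fix on the trace can be sketched as follows. This is an illustration only: it assumes the standard gas costs (8 for JUMP, 1 for JUMPDEST), and `TraceStep` and `fused_jump` are hypothetical names rather than the PR's API. The real tracer patches entries through `finalize_step` instead of pushing both steps at once.

```rust
/// Hypothetical, minimal trace entry (the PR's real step type has more fields).
#[derive(Debug, PartialEq, Eq)]
pub struct TraceStep {
    pub op: &'static str,
    pub gas: u64,      // gas remaining before the step executes
    pub gas_cost: u64, // gas charged by this step alone
}

const JUMP_COST: u64 = 8;
const JUMPDEST_COST: u64 = 1;

/// Executes a JUMP fused with its destination JUMPDEST.
/// Returns the remaining gas, or `None` when there is not enough gas.
/// With a tracer attached, two separate steps are recorded so the JUMP
/// reports a cost of 8 instead of the fused 9.
pub fn fused_jump(gas: u64, trace: Option<&mut Vec<TraceStep>>) -> Option<u64> {
    let after_jump = gas.checked_sub(JUMP_COST)?;
    let after_dest = after_jump.checked_sub(JUMPDEST_COST)?;
    if let Some(trace) = trace {
        trace.push(TraceStep { op: "JUMP", gas, gas_cost: JUMP_COST });
        // Synthetic entry: the JUMPDEST never went through the dispatch loop.
        trace.push(TraceStep { op: "JUMPDEST", gas: after_jump, gas_cost: JUMPDEST_COST });
    }
    Some(after_dest)
}
```

Passing `None` models the disabled hot path: the gas arithmetic is identical, but no entries are recorded.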
…ernode
- ethereum-package pinned at e4b3305 (2025-04) -> 71b02f6 (current main). Required to pick up the besu launcher fix that drops CLIQUE from --rpc-http-api (besu 26.x removed the namespace).
- geth v1.15.2 -> v1.17.3; besu main-142a5e6 -> main-6d54451; lighthouse v8.0.0-rc.1 -> v8.1.3 (the rc is over a year old).
- supernode: true on the ethrex participant. With Fulu at epoch 0, the package now requires at least one supernode, a node with 128+ validators, or perfect_peerdas_enabled.
Adds free-function writers (`write_streaming_step`, `write_streaming_summary`, `write_streaming_state_root`) producing geth/`evm --json` byte-compatible output. Keeps the existing RPC structLogger Serialize impl untouched; the two shapes coexist for different consumers (RPC wrapper vs. streaming CLI). The `stateRoot` summary line preserves the literal `"stateRoot": "` colon-space that goevmlab byte-searches for. Tests under test/tests/common/tracing_streaming_tests.rs anchor the streaming shape against captured `evm v1.17.3 run --json 0x6001600101` output and snapshot the legacy RPC shape to catch accidental drift.
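A minimal sketch of what such line writers could look like, assuming the per-step field encoding given in this PR's description (decimal `pc`/`op`/`memSize`/`depth`/`refund`, hex-string `gas`/`gasCost`, bottom-first stack). The `Step` struct and both function names are hypothetical stand-ins, `u128` substitutes for a 256-bit word, and optional fields (`memory`, `returnData`, `error`) are left out.

```rust
use std::io::{self, Write};

/// Hypothetical, trimmed-down step record (not the PR's `StructLog`).
pub struct Step {
    pub pc: u64,
    pub op: u8,
    pub gas: u64,
    pub gas_cost: u64,
    pub mem_size: u64,
    pub stack: Vec<u128>, // bottom-first; u128 stands in for U256
    pub depth: u32,
    pub refund: u64,
    pub op_name: &'static str,
}

/// Writes one trace line, terminated by a newline.
pub fn write_step_line<W: Write>(w: &mut W, s: &Step) -> io::Result<()> {
    // Each stack word becomes a quoted, minimal-width hex string such as "0x1".
    let stack: Vec<String> = s.stack.iter().map(|v| format!("\"{:#x}\"", v)).collect();
    writeln!(
        w,
        "{{\"pc\":{},\"op\":{},\"gas\":\"{:#x}\",\"gasCost\":\"{:#x}\",\"memSize\":{},\"stack\":[{}],\"depth\":{},\"refund\":{},\"opName\":\"{}\"}}",
        s.pc, s.op, s.gas, s.gas_cost, s.mem_size, stack.join(","), s.depth, s.refund, s.op_name
    )
}

/// Writes the terminator line. The space after the colon is deliberate:
/// the consumer searches for the literal bytes `"stateRoot": "`.
pub fn write_state_root_line<W: Write>(w: &mut W, root: &[u8; 32]) -> io::Result<()> {
    let hex: String = root.iter().map(|b| format!("{:02x}", b)).collect();
    writeln!(w, "{{\"stateRoot\": \"0x{}\"}}", hex)
}
```

Hand-formatting the lines, rather than going through a generic JSON serializer, keeps key order and whitespace fully under the writer's control, which is what a byte-for-byte comparison needs.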
Adds `LevmOpcodeTracer::streaming(cfg, sink)` that flushes each finalized
opcode step directly to a `Box<dyn Write>` instead of accumulating in `logs`,
giving O(1) peak memory for long traces. The RPC mode (no sink) is byte-for-
byte unchanged — the legacy structLogger Serialize impl and its snapshot test
are untouched.
`finalize_step` patches the parent step at `last_step_index`, then walks
`logs[idx..]` and writes each line in order — ensuring fused-JUMPDEST
synthetic steps arrive AFTER their parent JUMP/JUMPI rather than being
emitted mid-handler.
`flush_summary` and `flush_state_root` are exposed for the upcoming
ethrex-evm CLI: the latter emits `{"stateRoot": "0x..."}` with the literal
colon-space goevmlab byte-searches for.
Both `pre_step_capture` and `synthesize_step` honor the cap against
`streamed_count + logs.len()` so streaming doesn't silently drop the limit
once `logs` is truncated. After a write failure the tracer stops accumulating
into `logs` entirely; the caller drains the error via `take_stream_error`.
Tests live under test/tests/levm/opcode_tracer_streaming_tests.rs.
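The cap and write-failure behaviour described above can be sketched in a few lines. This is a simplified model, not the PR's `LevmOpcodeTracer`: it has no `logs` buffer or parent-step patching, `StreamingTracer` and `FailingSink` are hypothetical names, and treating a limit of 0 as "unlimited" is an assumption.

```rust
use std::io::{self, Write};

/// Hypothetical sink-backed tracer: every finalized line goes straight to
/// the sink, so memory use does not grow with trace length.
pub struct StreamingTracer {
    sink: Box<dyn Write>,
    limit: usize, // assumed convention: 0 means "no cap"
    streamed_count: usize,
    stream_error: Option<io::Error>,
}

impl StreamingTracer {
    pub fn new(sink: Box<dyn Write>, limit: usize) -> Self {
        Self { sink, limit, streamed_count: 0, stream_error: None }
    }

    /// Streams one already-serialized step line.
    pub fn on_step(&mut self, json_line: &str) {
        // After a failed write, stop doing any further work.
        if self.stream_error.is_some() {
            return;
        }
        // The cap counts lines already flushed, since nothing is retained.
        if self.limit != 0 && self.streamed_count >= self.limit {
            return;
        }
        match writeln!(self.sink, "{}", json_line) {
            Ok(()) => self.streamed_count += 1,
            Err(e) => self.stream_error = Some(e),
        }
    }

    pub fn streamed_count(&self) -> usize {
        self.streamed_count
    }

    /// Hands the first write error to the caller, leaving `None` behind.
    pub fn take_stream_error(&mut self) -> Option<io::Error> {
        self.stream_error.take()
    }
}

/// Test helper: a sink whose writes always fail.
pub struct FailingSink;

impl Write for FailingSink {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "sink closed"))
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Storing the error instead of returning it from `on_step` keeps the per-opcode hook infallible; the caller checks once, after execution, whether the stream is trustworthy.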
Adds a new cmd/ethrex-evm/ workspace member that will host the goevmlab-compatible CLI binary. Phase 3 lands the foundation:
* Crate skeleton (Cargo.toml in main workspace, placeholder main.rs, lib.rs re-exports).
* `compute_post_state_root(pre_state, updates) -> H256` builds an in-memory Store from a Genesis derived from the pre-state, calls add_initial_state via a one-shot tokio runtime, then synchronously runs apply_account_updates_batch and returns the resulting state_trie_hash.
* Inlined GeneralStateTest types under statetest/types.rs (Option B chosen because tooling/ef_tests/state lives in a separate Cargo workspace and pulls in revm + simd-json).
* Module-level rustdoc in statetest/mod.rs documenting the spike findings and the call sequence Phase 4 will follow per subtest.
Tests in cmd/ethrex-evm/tests/state_root.rs pin a captured H256 from a transfer scenario so future trie/encoding changes flag visibly, plus a determinism check on the empty-updates path.
`ExceptionalHalt::StackUnderflow` now carries `{ stack_len, required }` and
`StackOverflow` carries `{ stack_len, limit }`. Display emits the conventional
parameterized strings ("stack underflow (2 <=> 3)", "stack limit reached
1024 (1024)") used across major clients, so the upcoming ethrex-evm CLI can
surface goevmlab-diff-friendly error text without a custom mapping table.
Touches the stack-op call sites (pop, pop1, push, swap, dup) plus SWAPN /
EXCHANGE / DUPN handlers. No behavior change beyond the error strings.
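A minimal sketch of error variants that carry their own parameters and render the two strings quoted above. `StackError` and `require_items` are hypothetical names (the PR puts the variants on `ExceptionalHalt`), and the argument order in the overflow message (stack length first, then limit) is an assumption.

```rust
use std::fmt;

/// Hypothetical stand-in for the two stack-related halt variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StackError {
    Underflow { stack_len: usize, required: usize },
    Overflow { stack_len: usize, limit: usize },
}

impl fmt::Display for StackError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StackError::Underflow { stack_len, required } => {
                write!(f, "stack underflow ({} <=> {})", stack_len, required)
            }
            // Assumed order: current stack length, then the limit.
            StackError::Overflow { stack_len, limit } => {
                write!(f, "stack limit reached {} ({})", stack_len, limit)
            }
        }
    }
}

/// Example call site: fail with a parameterized error when an opcode
/// needs more items than the stack currently holds.
pub fn require_items(stack_len: usize, required: usize) -> Result<(), StackError> {
    if stack_len < required {
        Err(StackError::Underflow { stack_len, required })
    } else {
        Ok(())
    }
}
```

Because the numbers travel with the error value, the text is produced in one place (`Display`) and no separate mapping table is needed at the CLI layer.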
Adds the `ethrex-evm statetest` subcommand — the primary goevmlab
integration target. Reads GeneralStateTest JSON paths (positional or
newline-separated via stdin batch mode), executes each (fork, subtest) pair
through LEVM, streams EIP-3155 lines on stderr, terminates each test with
`{"stateRoot": "0x..."}` matching the byte sequence goevmlab's parser
searches for.
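A sketch of how positional paths and stdin batch mode might share one driver. It handles each stdin line as soon as it is read, on the assumption that the harness waits for one test's output before sending the next path. `for_each_test_path` is a hypothetical name, not a function from this PR.

```rust
use std::io::{self, BufRead};

/// Runs `run` once per test path and returns how many paths were handled.
/// Positional paths take precedence; with none given, newline-separated
/// paths are read from `stdin` and dispatched one at a time.
pub fn for_each_test_path<R, F>(positional: &[String], stdin: R, mut run: F) -> io::Result<usize>
where
    R: BufRead,
    F: FnMut(&str),
{
    if !positional.is_empty() {
        for p in positional {
            run(p.as_str());
        }
        return Ok(positional.len());
    }
    let mut count = 0;
    for line in stdin.lines() {
        let line = line?;
        let path = line.trim();
        // Blank lines are skipped rather than treated as paths.
        if path.is_empty() {
            continue;
        }
        run(path);
        count += 1;
    }
    Ok(count)
}
```

In a real binary the `stdin` argument would be something like `std::io::stdin().lock()`, and `run` would execute the state test and print its trace and `stateRoot` line before the next path is read.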
Geth-aligned CLI surface:
statetest --trace --trace.format=json --trace.nomemory=BOOL
--trace.nostack=BOOL --trace.noreturndata=BOOL
[--statetest.fork=NAME] [--statetest.index=I] [--run=REGEX]
[PATH...]
`--trace` is a bare boolean (presence ⇒ on) so `statetest --trace <path>`
works the way goevmlab invokes it.
The error_map module maps LEVM `VMError` / `ExceptionalHalt` /
`TxValidationError` variants to the strings geth's `core/vm/errors.go` and
`core/state_transition.go` emit ("intrinsic gas too low", "out of gas",
"write protection", etc.) so byte-diff over stderr matches.
The transaction envelope is chosen at runtime: legacy (`gasPrice`-only)
templates produce a `LegacyTransaction`; templates with `maxFeePerGas` or
`maxPriorityFeePerGas` produce an `EIP1559Transaction`. Forcing every tx
through a 1559 envelope would corrupt legacy fee math.
A `Run` subcommand stub is also wired in with all its geth-aligned flags
pre-defined; the body bails with "Phase 5: not yet implemented".
Tests under cmd/ethrex-evm/tests/ cover:
- 12 error-map assertions (one per mapped variant)
- 3 state-root determinism / stability tests (from Phase 3)
- 4 binary integration tests: positional path with trace, stdin batch mode,
unsupported --trace.format exit code, run-stub message.
The committed fixture exercises the TxValidation error path; a happy-path
fixture (where opcodes stream) needs runner/Fork wiring untangled and
is tracked as a follow-up.
Bump fixture gasLimit 21000 → 100000 so the tx executes instead of being rejected at LEVM's intrinsic-gas validation. The recipient has empty code, so the trace contains exactly one opcode line (implicit STOP) plus the summary + stateRoot terminator, covering the streaming path end-to-end.
Updates the pinned post-state hash and the integration tests:
- assert at least one opcode line streams before the summary
- assert the summary has no `error` field on happy path
The "intrinsic gas too low" assertion (TxValidation path) is dropped; that path is now covered by the error_map unit tests instead.
Note: 21000 gas on an empty-calldata transfer should pass under Shanghai intrinsic accounting (21000 base, 0 calldata). LEVM rejects it for an as-yet-unexplained reason, tracked separately. Not a blocker for goevmlab integration since real EF tests use higher gas limits.
Without a custom deserializer, `Vec<Bytes>` on `TestTransaction.data` was
parsing the JSON string "0x" as the literal two ASCII bytes ('0','x'),
turning what should be empty calldata into 2 nonzero bytes. The 32 extra
gas (2 * 16) pushed every basic transfer's intrinsic above its gas_limit
and made the binary reject txs that geth and the existing ef_tests state
runner accept.
Adds `deser_vec_hex_bytes` so each entry is hex-decoded ("0x" -> empty
Bytes). The fixture now executes at exactly 21000 gas; the trace shows
the implicit STOP at gas=0 and gasUsed=0x5208.
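The arithmetic behind this bug can be reproduced with a small sketch. It uses only the standard library (the PR's `deser_vec_hex_bytes` is a serde deserializer, which this does not replicate), and `intrinsic_gas` is a deliberately simplified model: 21000 base plus 4 gas per zero byte and 16 per nonzero byte, ignoring access lists, contract creation and later repricings. Both function names are hypothetical.

```rust
/// Decodes a hex string with an optional `0x` prefix into bytes.
/// `"0x"` decodes to an empty vector, not the ASCII bytes '0' and 'x'.
pub fn decode_hex_bytes(s: &str) -> Result<Vec<u8>, String> {
    let body = s.strip_prefix("0x").unwrap_or(s);
    if body.len() % 2 != 0 {
        return Err(format!("odd number of hex digits in {:?}", s));
    }
    if !body.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("non-hex character in {:?}", s));
    }
    // All characters are ASCII hex digits, so 2-byte slicing is safe.
    (0..body.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&body[i..i + 2], 16).map_err(|e| e.to_string()))
        .collect()
}

/// Simplified intrinsic gas for a plain call: base cost plus calldata cost.
pub fn intrinsic_gas(calldata: &[u8]) -> u64 {
    const BASE: u64 = 21_000;
    const ZERO_BYTE: u64 = 4;
    const NONZERO_BYTE: u64 = 16;
    let data_cost: u64 = calldata
        .iter()
        .map(|&b| if b == 0 { ZERO_BYTE } else { NONZERO_BYTE })
        .sum();
    BASE + data_cost
}
```

With correct decoding, `intrinsic_gas(&decode_hex_bytes("0x").unwrap())` is 21000. Feeding it the raw ASCII bytes `b"0x"` instead yields 21032, which exceeds a 21000 gas limit and explains the rejected transfers.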
Also align Genesis construction with `tooling/ef_tests/state`'s
`Genesis::from(&EFTest)`: leave `config` at Default (all forks inactive
in the chain header) and let LEVM's `EVMConfig` drive fork-specific
behavior at execution time. The bespoke `minimal_chain_config()` that
activated everything at block 0 was leaking Amsterdam/Prague checks
into Shanghai runs.
Tx envelope also reverted to always-EIP1559 with default fee fields,
mirroring `levm_runner::prepare_vm_for_tx`. The previous legacy/1559
branching wasn't needed; LEVM reads effective fees from the Environment.
Implements `ethrex-evm run [--json] [--codefile=- | <path>] [bytecode]` matching geth's `evm run` CLI surface: positional hex bytecode (or via --codefile, stdin with `-`), `--gas` default 10_000_000_000, geth's left-padded `"sender"` / `"receiver"` ASCII as default addresses, `--value` accepts hex or decimal (like geth's math.HexOrDecimal256), `--statdump` post-execution stats, `--ethrex-fork=NAME` override.
`--json` enables EIP-3155 streaming on stderr. Default trace modifiers match geth: --nomemory=true, --noreturndata=true, --nostack=false. Non-JSON mode prints `0x<output>\n` to stdout (and ` error: <err>` on stderr for revert), matching geth.
Bridges the intrinsic-gas gap: LEVM goes through full tx processing (deducts 21000 + calldata cost up-front), whereas geth's `runtime.Call` bypasses it. The implementation adds intrinsic gas to the gas_limit before execution and subtracts it from the reported gasUsed, so each per-step `gas` field matches geth's exactly. Module docs spell out the limits of this approach (Amsterdam+ reservoir, SSTORE-refund corner).
Tests:
- 6 integration tests under cmd/ethrex-evm/tests/run_tests.rs.
- Test 6 byte-exactly diffs `ethrex-evm run --json 0x6001600101` against a committed golden file captured from geth v1.17.3 (also pinned in tests/fixtures/GETH_VERSION.txt).
- All 25 ethrex-evm tests pass.
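Two of the mechanisms in this commit, the hex-or-decimal `--value` parsing and the intrinsic-gas bridge, can be sketched as below. All three function names are hypothetical, `u128` stands in for a 256-bit value, and the parser is stricter than geth's `math.HexOrDecimal256` may be (for example, it rejects an empty string).

```rust
use std::num::ParseIntError;

/// Parses `0x`-prefixed input as hex and anything else as decimal.
pub fn parse_hex_or_decimal(s: &str) -> Result<u128, ParseIntError> {
    match s.strip_prefix("0x").or_else(|| s.strip_prefix("0X")) {
        Some(hex) => u128::from_str_radix(hex, 16),
        None => s.parse::<u128>(),
    }
}

/// Gas limit handed to the transaction path: the user-requested gas plus
/// the intrinsic cost that full tx processing deducts up-front.
pub fn padded_gas_limit(requested_gas: u64, intrinsic: u64) -> u64 {
    requested_gas.saturating_add(intrinsic)
}

/// Gas used as reported to the user: total consumption with the
/// intrinsic part removed, leaving only the execution cost.
pub fn reported_gas_used(total_used: u64, intrinsic: u64) -> u64 {
    total_used.saturating_sub(intrinsic)
}
```

For the bytecode `0x6001600101` (PUSH1, PUSH1, ADD at 3 gas each) with empty calldata, the transaction path would consume 21009 gas in total, and `reported_gas_used(21_009, 21_000)` gives the 9 gas of pure execution.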
Extends the statetest runner to handle three more shapes found in real
EF / execution-spec-tests vectors:
- **EIP-4844 (type-3) blob txs.** TestTransaction gains
`blobVersionedHashes` and `maxFeePerBlobGas`; TestEnv gains
`currentExcessBlobGas`. These are passed through the LEVM Environment
(tx_blob_hashes, tx_max_fee_per_blob_gas, block_excess_blob_gas) so
blob fee math is correct. No new tx envelope — blob vectors continue
to use EIP1559Transaction, matching tooling/ef_tests/state.
- **EIP-7702 (type-4) setcode txs.** Inlines TestAuthTuple
({chainId, address, nonce, v|yParity, r, s}) and adds
`authorizationList` to TestTransaction. Runner dispatches to
Transaction::EIP7702Transaction when an auth list is present; rejects
setcode-creation (no contract creation in type 4).
- **`sender` field fallback.** TestTransaction.secretKey is now
optional; the runner checks `sender` first and derives the address from
secretKey only when `sender` is absent.
Three new fixtures pin the stateRoot for each path:
- statetest_blob_tx.json -> 0xe15c0784...
- statetest_setcode_tx.json -> 0x5233f79c...
- statetest_sender_field.json -> 0xeb9c2278...
Adds cmd/ethrex-evm/README.md covering both subcommands, flag tables,
EIP-3155 line schema, goevmlab integration recipe, and the golden-
fixture regeneration procedure.
Error-map: all type-3/4 TxValidationError variants were already mapped
in a prior phase; no new arms needed.
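The error map mentioned here amounts to a total function from error variants to fixed strings. A minimal sketch, using a hypothetical `Failure` enum with three of the strings named earlier in this PR (the real map covers LEVM's `VMError`, `ExceptionalHalt` and `TxValidationError` variants):

```rust
/// Hypothetical subset of failure kinds; variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Failure {
    IntrinsicGasTooLow,
    OutOfGas,
    WriteProtection,
}

/// Returns the error text the reference client is expected to print.
/// The `match` has no wildcard arm, so adding a variant without a
/// mapping is a compile error rather than a silent trace mismatch.
pub fn geth_error_string(failure: Failure) -> &'static str {
    match failure {
        Failure::IntrinsicGasTooLow => "intrinsic gas too low",
        Failure::OutOfGas => "out of gas",
        Failure::WriteProtection => "write protection",
    }
}
```

Keeping the match exhaustive is what allows a statement like "no new arms needed" to be checked by the compiler when new transaction types are added.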
Tests: 28 in cmd/ethrex-evm pass (12 error_map + 3 state_root + 7
statetest_runner including the 3 new fixtures + 6 run).
`run` was a secondary dev tool; goevmlab integration only uses the statetest path. Removing it cuts ~430 LOC of production + ~335 LOC of tests + the geth golden fixture without affecting the goevmlab goal.
Also drops the four committed statetest JSON fixtures and the two integration test files that depended on them. Unit coverage of the internal pieces (error mapping table, post-state-root helper, type parsing via serde) stays in place; end-to-end binary validation moves to goevmlab itself, which exercises the same path continuously.
README slimmed to statetest-only. `run`, `t8n`, and a CI fuzzing workflow are now listed as future work.
Net: -1504 lines across the crate. Test count goes from 28 to 15.
Benchmark Results Comparison: No significant difference was registered for any benchmark run. Benchmarks covered: BubbleSort, ERC20Approval, ERC20Mint, ERC20Transfer, Factorial, FactorialRecursive, Fibonacci, FibonacciRecursive, ManyHashes, MstoreBench, Push, SstoreBench_no_opt.
Summary
Adds a new `ethrex-evm` binary (under `cmd/ethrex-evm/`) that is binary-compatible with `holiman/goevmlab`'s fuzzing harness, enabling ethrex/LEVM to participate as a target client in cross-EVM differential fuzz campaigns.

One subcommand: `statetest`. Reads `GeneralStateTest` JSON (positional path or stdin batch), executes each `(fork, subtest)` through LEVM, streams EIP-3155 trace lines on stderr, and emits a `{"stateRoot": "0x..."}` terminator line (the literal byte sequence goevmlab's parser searches for).

goevmlab smoke test (verified end-to-end)
The "binary-compatible with goevmlab" claim is empirically verified, not theoretical. A goevmlab adapter (`evms/ethrex.go`) was authored, registered alongside the existing client adapters, and exercised via `cmd/runtest` against two state-test fixtures:
- … `PUSH1 1 PUSH1 1 ADD STOP` (real opcode lines stream).

Both fixtures produced byte-identical MD5 output between `ethrex-evm` and `geth v1.17.3` under goevmlab's `CompareFiles`. No consensus flaw reported.

Verified end-to-end:
- `JsonlScanner` parses our per-step lines and the summary line.
- `ParseStateRoot` finds our `{"stateRoot": "0x..."}` terminator via the literal-byte search (which is why the colon-space matters).
- `Copy` filter produces canonical output matching geth's adapter byte-for-byte.

The adapter is staged for upstream: https://github.com/edg-l/goevmlab/tree/feat/ethrex-adapter

What's in this PR
Ten commits, layered from the format primitives up to the binary:
- … (`crates/common/tracing.rs`): free-function writers `write_streaming_step`, `write_streaming_summary`, `write_streaming_state_root` that emit the conventional cross-client `structLogger` byte sequence per step plus the `{"stateRoot": "0x..."}` terminator (the colon-space is required for goevmlab's literal byte search).
- `LevmOpcodeTracer`: `Option<Box<dyn Write>>` sink that flushes each finalized step immediately and drops it from `logs`, giving O(1) peak memory regardless of trace length. RPC mode unchanged (snapshot-tested).
- `compute_post_state_root` helper + crate scaffold: bridges LEVM's `get_state_transitions` with `Store::apply_account_updates_batch` via a one-shot tokio runtime, behind a sync API.
- `ExceptionalHalt::StackUnderflow { stack_len, required }` and `StackOverflow { stack_len, limit }` so the `Display` impl produces the conventional `"stack underflow (N <=> M)"` / `"stack limit reached L (N)"` strings.
- `statetest` subcommand: clap CLI matching geth's flag set (`--trace`, `--trace.format=json`, `--trace.nomemory=BOOL`, `--statetest.fork=NAME`, etc.), file collection + stdin batch mode, EF-`TestTransaction` JSON parsing (inlined types, see below), exhaustive `VMError → geth-string` error mapping.
- … `TestTransaction`; envelope dispatch in the runner; `sender` field fallback when `secretKey` is absent.
- … `"0x"` was being treated as ASCII bytes `'0','x'`, over-charging intrinsic by 32 gas), Genesis chain-config defaulting (prevents fork leakage from in-memory store), legacy/1559 envelope unification (mirror `levm_runner.rs`).

Format details
Per-step line shape (matches `geth/eth/tracers/logger/gen_structlog.go`):
`{"pc":4,"op":1,"gas":"0x2540be3fa","gasCost":"0x3","memSize":0,"stack":["0x1","0x1"],"depth":1,"refund":0,"opName":"ADD"}`
- `op` is the decimal byte value (`opName` carries the mnemonic).
- `gas`/`gasCost` are hex strings; `memSize`/`depth`/`refund` are decimal.
- `memory`/`returnData` omitted unless explicitly enabled.
- `stack` is bottom-first.

Summary line:
`{"output":"<hex no 0x>","gasUsed":"0x...","error":"..."}`

State-root terminator (the goevmlab handshake):
`{"stateRoot": "0x<64 hex chars>"}`

Supported transaction shapes
- … `EIP1559Transaction`).
- … `blobVersionedHashes`, `maxFeePerBlobGas`, `currentExcessBlobGas`).
- … `authorizationList` with `v` or `yParity`).
- … `sender` field instead of `secretKey`.

Test coverage
- `test/tests/common/tracing_streaming_tests.rs` (one per field-encoding rule + a snapshot of the legacy RPC shape).
- `test/tests/levm/opcode_tracer_streaming_tests.rs` (basic streaming, synthetic JUMPDEST ordering, cap honored across real+synth, write-failure path, flush summary/state-root, RPC-mode regression).

Inlined GeneralStateTest types
`cmd/ethrex-evm/src/statetest/types.rs` re-declares the minimum subset of EF's `GeneralStateTest` schema needed by the binary. Rationale: `tooling/ef_tests/state` lives in a separate Cargo workspace and pulls in `revm` + `simd-json`, neither of which we want in the main workspace for a release binary. The inlined types are annotated with the upstream source location.

What's NOT in this PR
- `run` subcommand (raw bytecode execution). Dropped to keep this PR focused on the goevmlab path. Tracked for a follow-up.
- `t8n` subcommand. goevmlab does not use it.
- `goevmlab/evms/ethrex.go` adapter. Staged at edg-l/goevmlab#feat/ethrex-adapter; will be opened upstream after this binary lands.

Test plan
- `cargo test -p ethrex-evm` (15 unit tests pass)
- `cargo test -p ethrex-test --test ethrex_tests` (streaming serializer + tracer sink tests)
- `cargo test -p ethrex-levm --lib` (no regressions from the stack-error variant change)
- `debug_traceTransaction`'s RPC `structLogs` shape unchanged (snapshot test `test_1_5_legacy_rpc_serialize_snapshot` passes)