feat: add state size delta and trie depth #722

Open
weiihann wants to merge 6 commits into ethpandaops:master from weiihann:feat/trie-depth

@weiihann commented Dec 29, 2025

In discussion with @samcm to utilize ethereum/go-ethereum@01b39c9 because there are issues with the debug_stateSize method for state size collection.

@weiihann marked this pull request as ready for review January 8, 2026 04:56
@weiihann requested a review from Savid as a code owner January 8, 2026 04:56
@weiihann changed the title from "feat: add canonical_execution_trie_depth" to "feat: add state size delta and trie depth" Jan 8, 2026
Comment thread deploy/migrations/clickhouse/089_execution_mpt_depth.up.sql Outdated
@weiihann marked this pull request as draft January 12, 2026 02:43
@weiihann (Author) commented:

Converted to draft, since the live tracer has to be implemented in geth first.

weiihann added 4 commits May 14, 2026 11:42
rename

add parent root

add bytes

fix

refactor: use map for depth

feat(proto): add execution block state metrics

feat(event-ingester): add execution block state metrics

chore

feat: add new module

revert
…kHouse

The execution_state_size_delta table previously stored precomputed signed
deltas (account_delta, account_bytes_delta, ...). That representation lost
information: a block that updates one account looked the same as a block
that adds and removes accounts whose totals cancel.

This refactor splits each metric into a (writes, deletes) pair carrying
the gross churn:

  - 20 stored Int64 columns covering 5 categories x {count, bytes} x
    {writes, deletes}: accounts, account trie nodes, contract code,
    storage slots, storage trie nodes.
  - 10 MATERIALIZED Int64 columns derived as (writes - deletes) preserve
    the original delta semantics for any query that wants the net change.
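
The column grid described above can be enumerated mechanically. The following is only a sketch: the suffixes `_writes`, `_deletes`, `_delete_bytes`, `_delta` and `_bytes_delta` and the prefixes `account` and `contract_code` appear in this commit message, while `_write_bytes` and the remaining three category keys are assumptions made for illustration.

```go
package main

import "fmt"

// Category keys: "account" and "contract_code" appear in the commit
// message; the other three are hypothetical placeholders.
var categories = []string{
	"account", "account_trienode", "contract_code", "storage", "storage_trienode",
}

// storedColumns returns one name per (category, unit, direction) combination.
// The "_write_bytes" suffix is assumed by analogy with "_delete_bytes".
func storedColumns() []string {
	var cols []string
	for _, c := range categories {
		cols = append(cols,
			c+"_writes", c+"_write_bytes",
			c+"_deletes", c+"_delete_bytes",
		)
	}
	return cols
}

// derivedColumns returns one net-delta name per (category, unit) combination.
func derivedColumns() []string {
	var cols []string
	for _, c := range categories {
		cols = append(cols, c+"_delta", c+"_bytes_delta")
	}
	return cols
}

func main() {
	fmt.Println(len(storedColumns()), len(derivedColumns())) // 20 10
}
```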

An "update" counts as both a write of the new value and a delete of the
previous value, so (writes - deletes) recovers the net delta for all
three cases: +1 for a create, 0 for an update, -1 for a delete.
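
A minimal sketch of that accounting rule, for the count columns only. The `Churn` type and its `Apply` method are hypothetical names for illustration, not the tracer's actual code.

```go
package main

import "fmt"

// Churn is a hypothetical stand-in for one (writes, deletes) column pair.
type Churn struct {
	Writes  int64
	Deletes int64
}

// Apply records one state change. hadPrev and hasNew say whether a value
// existed before and after the change, respectively.
func (c *Churn) Apply(hadPrev, hasNew bool) {
	if hasNew {
		c.Writes++ // create or update: the new value is written
	}
	if hadPrev {
		c.Deletes++ // update or delete: the previous value is removed
	}
}

// Delta mirrors the MATERIALIZED (writes - deletes) expression.
func (c Churn) Delta() int64 { return c.Writes - c.Deletes }

func main() {
	var create, update, del Churn
	create.Apply(false, true)
	update.Apply(true, true)
	del.Apply(true, false)
	fmt.Println(create.Delta(), update.Delta(), del.Delta()) // 1 0 -1
}
```

Under this rule a block that touches nothing and a block that updates one account both have a net delta of 0, but only the second has non-zero writes and deletes, which is the information the signed-delta representation could not carry.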

Contract code remains write-only in geth's state sizer (it is
content-addressed and shared across accounts, so deletion would require
reference counting). The contract_code_deletes / contract_code_delete_bytes
columns are present for schema symmetry and stay 0 until upstream geth
grows ref-counting; contract_code_delta therefore equals contract_code_writes.

Touches the geth tracer (writes/deletes nested JSON instead of delta),
the proto (renamed *_delta fields to *_writes and added *_deletes),
the Vector transforms (sentry-logs normaliser + kafka-to-clickhouse VRL),
and the ClickHouse migration. Verified end-to-end with the test-geth
1M-block mainnet import: 1,000,001 rows in each table, MATERIALIZED
delta identity holds for every row.
weiihann added 2 commits May 14, 2026 12:15
…igration

Both tables are populated from the same geth "State metrics" log line and
have no purpose without each other. Splitting them across two migration
files (003_execution_state_size_delta + 004_execution_mpt_depth) added
no value and made the ordering brittle when rebasing past unrelated
upstream migrations.

The combined 003_state_metrics.{up,down}.sql holds the CREATE (up) and
DROP (down) statements for both tables. The migrator's behaviour is
unchanged: golang-migrate splits on ';' and runs each statement
individually, identical to running two separate files.
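
To illustrate why one combined file behaves like two, here is a naive statement splitter. It is not golang-migrate's implementation, only a sketch of the splitting idea; `splitStatements` is a hypothetical helper and the table bodies are placeholders rather than the real schema.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStatements is a naive illustration: it splits on every ';' and
// drops empty fragments. It does not handle ';' inside string literals
// or comments.
func splitStatements(migration string) []string {
	var stmts []string
	for _, part := range strings.Split(migration, ";") {
		if s := strings.TrimSpace(part); s != "" {
			stmts = append(stmts, s)
		}
	}
	return stmts
}

func main() {
	// Column lists are elided placeholders, not the real table definitions.
	combined := `
CREATE TABLE execution_state_size_delta (block_number UInt64) ENGINE = Memory;
CREATE TABLE execution_mpt_depth (block_number UInt64) ENGINE = Memory;
`
	for i, s := range splitStatements(combined) {
		fmt.Printf("%d: %s\n", i+1, s)
	}
}
```
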
Adds the hand-written and codegen pairs that let xatu-consumoor land
EXECUTION_STATE_SIZE_DELTA and EXECUTION_MPT_DEPTH events directly into
ClickHouse via the protobuf path.

Also fixes a chgo-rowgen omission: MATERIALIZED columns must not appear
in the insert column list. Previously the tool included them, which would
have caused INSERT to fail with "Cannot insert column ... — it is
MATERIALIZED". The fix filters default_kind IN ('MATERIALIZED', 'ALIAS')
from the system.columns query so the generated struct only carries
columns the consumer can actually write to. The
fast_confirmation.gen.go file is regenerated as a side effect (only the
map capacity hint changed; no behavioural difference).
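
The filtering rule can be sketched in Go as below. The commit applies it inside the system.columns query; this version filters in memory purely for illustration, and the `Column` type and `insertable` function are hypothetical names, not chgo-rowgen's code.

```go
package main

import "fmt"

// Column is a hypothetical projection of a system.columns row.
type Column struct {
	Name        string
	DefaultKind string // "", "DEFAULT", "MATERIALIZED", "ALIAS", ...
}

// insertable keeps only the columns a client may name in an INSERT,
// skipping those whose values ClickHouse computes itself.
func insertable(cols []Column) []Column {
	out := make([]Column, 0, len(cols))
	for _, c := range cols {
		switch c.DefaultKind {
		case "MATERIALIZED", "ALIAS":
			continue // skips to the next column in the enclosing loop
		}
		out = append(out, c)
	}
	return out
}

func main() {
	cols := []Column{
		{Name: "contract_code_writes"},
		{Name: "contract_code_deletes"},
		{Name: "contract_code_delta", DefaultKind: "MATERIALIZED"},
	}
	for _, c := range insertable(cols) {
		fmt.Println(c.Name)
	}
}
```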

execution_mpt_depth.go also adapts the proto's map<uint32, uint64>
(proto3 has no uint8) to ClickHouse's Map(UInt8, UInt64) via toDepthMap.
Trie depths are bounded by [0, 64] per the geth state sizer; entries
with out-of-range keys are dropped defensively.
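
A minimal sketch of what such an adapter might look like. Only the name `toDepthMap`, the key types and the [0, 64] bound come from the commit message; the signature, the `maxTrieDepth` constant and the body are assumptions.

```go
package main

import "fmt"

// maxTrieDepth is the upper bound quoted in the commit message.
const maxTrieDepth = 64

// toDepthMap narrows proto-style uint32 keys to uint8. Keys above
// maxTrieDepth are dropped rather than cast, since a plain uint8
// conversion would silently wrap them onto a valid depth.
func toDepthMap(in map[uint32]uint64) map[uint8]uint64 {
	out := make(map[uint8]uint64, len(in))
	for depth, count := range in {
		if depth > maxTrieDepth {
			continue
		}
		out[uint8(depth)] = count
	}
	return out
}

func main() {
	got := toDepthMap(map[uint32]uint64{0: 10, 7: 250, 64: 1, 300: 99})
	fmt.Println(len(got), got[7]) // 3 250
}
```

Dropping instead of casting matters here: `uint8(300)` is 44, so a bare conversion would credit the out-of-range entry to depth 44.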

Verified end-to-end with the 1M-block test-geth import: 1,000,001 rows
in each table, the 10 MATERIALIZED delta identities hold for every row,
and the per-depth Map columns are populated correctly.