Skip to content

feat: merge-train/spartan-v5#24053

Merged
PhilWindle merged 19 commits into
v5-nextfrom
merge-train/spartan-v5
Jun 16, 2026
Merged

feat: merge-train/spartan-v5#24053
PhilWindle merged 19 commits into
v5-nextfrom
merge-train/spartan-v5

Conversation

@AztecBot

@AztecBot AztecBot commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

BEGIN_COMMIT_OVERRIDE
refactor(stdlib)!: thin chain-checkpointed event, collapse sync (#24007)
refactor!: remove proposedCheckpoint tip (#24008)
fix(sequencer): wait for previous L1 block before publishing (#24037)
test(e2e_fees): bridge fee juice from a dedicated L1 account (#24054)
fix(aztec-node): pipelining-aware slot and fee simulation in simulatePublicCalls (#24031)
refactor: move registerContractFunctionSignatures to the node debug API (#24066)
fix: resolve v5-next → merge-train/spartan-v5 conflict (#24072)
chore: merge v5-next into merge-train/spartan-v5 (raw, conflict markers) (#24071)
test(ci): mark e2e_epochs/epochs_mbps_redistribution as flaky (#24098)
fix: isolate cross-chain L1 writes from publisher nonce (#24104)
test(world-state): make delayed-close fork queue-cleanup wait deterministic (#24106)
fix(kv-store): lazy-load skipped browser benchmark deps (#24108)
feat(p2p): slash proposers exceeding max blocks per checkpoint (A-1166) (#24041)
fix(stdlib): fix race conditions in L2BlockStream (#24042)
test(prover-node): make checkpoint store pruning test deterministic (#24130)
END_COMMIT_OVERRIDE

## Motivation

`L2BlockStream.work()` reconciled the four chain tips through
independent code paths — a reorg walk, two
checkpoint loops with a shared cursor, a block loop, and a final
proven/finalized diff — with optimizations
(`startingBlock`, `skipFinalized`) that muted some paths but not others.
Nothing enforced that what one poll
emitted was tier-consistent, which produced a family of bugs (detailed
below). The root enabler was the fat
`chain-checkpointed` event: because it carried a full
`PublishedCheckpoint`, the stream had to fetch, order,
and replay every checkpoint individually, and every special case had to
reason about when that replay could be
skipped.

This PR makes `chain-checkpointed` a thin tip event like
`chain-proven`/`chain-finalized` and collapses
`work()` into three symmetric steps: reorg detection, one block-download
loop, and unconditional end-of-pass
tier reconciliation from a single source snapshot. The two consumers
that relied on per-checkpoint payload
delivery are first decoupled: the sentinel stops needing a block stream
at all, and the prover-node drives its
own checkpoint catch-up from a cursor.

## Bugs and races fixed

- **Checkpointed cursor stuck at genesis under `startingBlock`
(A-1061).** The `startingBlock` fast-forward
arm suppressed all checkpoint emission while the end-of-pass
`chain-proven` still fired, leaving the local
store with `proven > checkpointed` and the checkpointed cursor at
genesis until the source's checkpointed
tip passed `startingBlock`. In p2p this degraded `isEpochPrune` (epoch
prunes misclassified as ordinary
prunes, so `txPoolDeleteTxsAfterReorg` was not honored during the
window); the same pattern affected the
sentinel and prover-node streams. Tier reconciliation is now
unconditional against the source snapshot, so
the first pass converges. Regression: *emits the source checkpointed tip
on the FIRST pass even when
  startingBlock is past it*.
- **Duplicate / out-of-order checkpoint events after a reorg
(stale-snapshot guard).** Loop 1's "blocks
already local" guard compared against the `localTips` snapshot taken
before the prune handler rewrote the
store, so a node lagging a checkpoint that detected a reorg in the same
pass could emit `chain-checkpointed`
for a new-fork checkpoint before its blocks, then re-emit checkpoints
from the prune target upward. Loop 1
is deleted; reconciliation re-reads the local tips after a prune so it
compares against the just-clamped
cursors. Regression: *does not re-emit the checkpointed tip after
pruning to a block ahead of it*.
- **Same-height stale cursors never refreshed (number-only
comparisons).** The proven/finalized gates compared
block numbers only, so a same-number/different-hash tip after a reorg
kept a stale (hash, checkpoint-id)
pair until the tip next advanced. Gates now compare (number, hash) —
skipping the hash comparison when the
local hash is `undefined`, which world-state legitimately reports for
tips ahead of its synced range
(comparing against it would re-emit the event on every poll).
Regressions: *re-emits the proven tip when
numbers match but the known local hash differs* and *does not re-emit
when the local hash is undefined*.
- **Sentinel credited attestors on reorged-out checkpoints.** Its
`slotNumberToCheckpoint` map had no
`chain-pruned` handling, so a reorged-out checkpoint's attestation entry
lingered and `getSlotActivity`
could classify validators against a non-canonical checkpoint. The map
(and the stream feeding it) is gone;
the sentinel fetches the checkpoint for a slot on demand, so the answer
is always canonical.
- **Read-skew race between `getL2Tips` and `getCheckpoints`.** The old
checkpoint loops fetched payloads in
separate reads from the tips snapshot; a source-side reorg between the
two reads could plant inconsistent
state in the local store (the reason catch-up code needed
validation-and-abort machinery). Eliminated
structurally: every tier event is now built from the same `getL2Tips`
snapshot, with no second fetch to
  skew against.
- **Prover-node restart could have skipped unproven checkpoints
(prevented by design here).** With catch-up
driven by a cursor, seeding it from a checkpointed tip would silently
skip the unproven checkpoints of a
partially-proven epoch on restart. The cursor seeds from the last
checkpoint of the last fully-proven epoch
(or 0), and advances only after both `checkpointStore.addOrUpdate` and
`sessionManager.onCheckpointAdded`
succeed, preserving the existing at-least-once retry semantics (A-1041).
Epoch expiry additionally gets a
periodic tick, since it previously piggybacked on per-checkpoint event
volume.

## Approach

- **Sentinel (decoupled first):** deletes its `L2BlockStream`,
`L2TipsMemoryStore`, slot→checkpoint map, and
the manual stream sync in its work loop. `getSlotActivity` fetches
`archiver.getCheckpoint({ slot })` on
demand (the by-slot query already existed) and derives the same
attestation data; the p2p-sync gate reads
  the archiver tips directly instead of a stale local mirror.
- **Prover-node (decoupled second):** on a thin `chain-checkpointed` tip
event, walks every checkpoint between
its cursor and the tip: light `getCheckpointsData` metadata first,
whole-epoch relevance filtering (an epoch
is skipped only if fully proven or past its proof-submission window —
never individual checkpoints inside a
provable epoch, which the SessionManager's full-coverage contract
requires), then a heavy `getCheckpoint`
fetch only for checkpoints it will actually register. This is strictly
cheaper than the old stream replay,
which transferred every full checkpoint payload before the prover-node
could decide to skip it.
- **Stream rewrite:** `chain-checkpointed` becomes `{ block: L2BlockId,
checkpoint: CheckpointId }`, emitted at
most once per pass — symmetric with `chain-proven`/`chain-finalized`.
`work()` is now: reorg walk-back +
`chain-pruned`; one `getBlocks` download loop (start incorporates
`startingBlock`/`skipFinalized`);
end-of-pass reconciliation checkpointed → proven → finalized from one
snapshot. PXE, the one remaining
payload consumer, fetches its anchor header by hash (reorg-safe) and
skips the update if the block vanished.

## API changes

Internal API only (no RPC schema changes):

- `L2BlockStreamEvent`: `chain-checkpointed` carries `{ block,
checkpoint }` ids instead of a
`PublishedCheckpoint`, and fires at most once per sync pass instead of
once per checkpoint. Consumers
  needing payloads fetch them from the block source.
- `L2BlockStream` source narrows to `Pick<L2BlockSource, 'getBlocks' |
'getBlockData' | 'getL2Tips'>`; the
`checkpointPrefetchLimit` option and `CHECKPOINT_PREFETCH_LIMIT` export
are gone.
- `LocalChainTips.checkpointed` widens to `{ block, checkpoint }` so the
checkpointed tier can hash-gate like
  proven/finalized (still structurally assignable from `LocalL2Tips`).

## Simplifications

- `L2BlockStream.work()`: ~220 → ~115 lines. Deleted: Loop 1
(already-local checkpoint backfill), Loop 2 +
prefetch buffer (checkpoint-transport block download),
`nextCheckpointToEmit`, both `startingBlock`
checkpoint fast-forward arms, and the checkpoint payload fetching
altogether.
- Catch-up emits at most one checkpointed event per pass regardless of
lag — no per-checkpoint replay, no
  multi-emission warning path, no anti-spam special cases.
- Sentinel: net ~50 lines removed plus a whole subsystem dependency
(stream + tips store) — replaced by one
  ~25-line on-demand fetch.
- PXE node adapter: the ~25-line `getCheckpoints` implementation is
deleted; the telemetry stream wrapper
  narrows accordingly.
- Tips stores: `handleChainCheckpointed` reads ids straight off the
event instead of recomputing the
  checkpoint hash from the payload.
- Stream test suite: rewritten from 1,828 to ~640 lines while adding
four regression tests for the bugs above.
- Net across the branch: **−1,228 lines** over 15 files.

## Changes

- **stdlib**: thin `chain-checkpointed` event; collapsed `work()`;
hash-aware tier gates; narrowed stream
  source type; tips-store handler reads event ids directly.
- **stdlib (tests)**: stream suite rewritten around the new event
semantics + regression tests for A-1061,
  post-prune reconciliation, and both hash-gate behaviors.
- **prover-node**: cursor-driven checkpoint catch-up
(`processCheckpointJump`/`registerCheckpoint`/
`computeStartingCheckpoint`), whole-epoch relevance filtering, prune
clamping, periodic epoch-expiry tick.
- **aztec-node (sentinel)**: block stream, tips store, and
slot→checkpoint map deleted; on-demand canonical
  checkpoint fetch; direct archiver read for the p2p-sync gate.
- **pxe**: anchor header fetched by hash on checkpointed-tip events;
node adapter loses `getCheckpoints`.
- **kv-store / telemetry-client**: test-suite and wrapper-type
adjustments to the new event shape.

Fixes A-1061

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
## Motivation

The `proposedCheckpoint` tip on `L2Tips` was the checkpoint frontier
including L1-submitted-but-unconfirmed proposals, sitting between
`proposed` (blocks) and `checkpointed` (confirmed). It existed only for
proposer pipelining and had outlived its usefulness as a tip:

- It was internal-only — never carried on the public RPC surface
(`ChainTips` already excluded it), so no schema or storage migration is
involved in removing it.
- It was degenerate everywhere except the archiver: local stores
(world-state, l2-tips-store) could never represent it (no stream event
carries it; the cursor was always equal to `checkpointed`), which is why
earlier work had already narrowed `LocalL2Tips` to omit it.
- It was fabricated by shims purely to satisfy the type: the PXE node
adapter and the TXE archiver both invented a `proposedCheckpoint:
checkpointed` value.
- Most importantly, the tip and its payload were read from two
independent archiver snapshots — a JS-side tips cache for the tip vs. a
direct store read on `#proposedCheckpoints` for the payload. A
concurrent archiver write could be observed split, which forced explicit
race-detection/refuse-to-proceed code in the sequencer's `checkSync`.

## Approach

Drop the `proposedCheckpoint` tip from `L2Tips` entirely. The few
consumers that needed the proposed-checkpoint frontier (the sequencer
pipelining path, `automine`, and the node's `simulatePublicCalls`) read
the payload directly from the existing archiver method:

```ts
// L2BlockSource / archiver
getProposedCheckpointData(): Promise<ProposedCheckpointData | undefined>;
```

With no argument this returns the latest proposed (not-yet-L1-confirmed)
checkpoint payload, which is always the leading one:
`addProposedCheckpoint` only stores a checkpoint number beyond the
confirmed frontier (sequentiality is enforced), and every confirmation
path (`addCheckpoints`, `promoteProposedToCheckpointed`) deletes the
matching proposed entry in the same transaction. Consumers derive the
proposed tip from the payload — checkpoint number, and last block =
`startBlock + blockCount - 1` — so there is no separate tip to read and
no tip-vs-payload split to reconcile. This removes the snapshot-race
reconciliation code that previously lived in the sequencer's
`checkSync`.

The proposed-but-unconfirmed checkpoint frontier is archiver-internal.
It is **not** exposed on the public node RPC surface, so the
`'proposed'` checkpoint tag is removed from `getCheckpointNumber` and
`getCheckpoint` (see breaking change below).

## Breaking change (RPC)

The `'proposed'` checkpoint tag is no longer accepted by the node's
`getCheckpointNumber(tip?)` and `getCheckpoint(param)` (and is dropped
from `CheckpointParameter`). The proposed-but-unconfirmed checkpoint
frontier was never a public concept; checkpoint-tip selectors are now
`'checkpointed' | 'proven' | 'finalized'`. The block-level `'proposed'`
tag (latest proposed block) on `getBlockNumber` / `getChainTips` is
unchanged.

## API changes (internal)

- `proposedCheckpoint` removed from `L2Tips` and `L2TipsSchema`;
`'proposedCheckpoint'` removed from `L2BlockTag`.
- `L2BlockSource.getProposedCheckpointData(query?)` is unchanged; its
no-arg form returns the leading proposed-checkpoint payload (used by the
sequencer / `simulatePublicCalls`), and its by-number/by-slot forms back
the node's confirmed→proposed `getCheckpoint` fallback.
- The `ChainTip` / `ChainTips` / `ChainTipSchema` / `ChainTipsSchema`
aliases are removed. Block-level call sites use `L2BlockTag` / `L2Tips`
/ `L2TipsSchema` / `BlockTagWithoutLatestSchema` directly;
checkpoint-tip selectors use the new `CheckpointTag` /
`CheckpointTagSchema` (`'checkpointed' | 'proven' | 'finalized'`).
`LocalL2Tips` continues to alias `L2Tips`.

## Changes

- **stdlib**: removed the tip from `L2Tips`/`L2TipsSchema`/`L2BlockTag`;
replaced the `ChainTip(s)` aliases with
`CheckpointTag`/`CheckpointTagSchema` and direct `L2*` usage; narrowed
`getCheckpointNumber`/`CheckpointParameter` to checkpoint tags; updated
interface tests.
- **archiver**: removed the `proposedCheckpoint` cursor from
`getL2TipsData` (tips cache); rewrote `pruneOrphanProposedBlocks` to
derive the proposed-checkpoint frontier block number from the payload
store; cleaned the trace log and the mocks.
- **sequencer-client**: `checkSync` consumes the proposed payload via
`getProposedCheckpointData()`; **deleted the snapshot-race
reconciliation block** (the inconsistent-proposed-checkpoint
refuse-to-proceed guard) that only existed because tip and payload were
read separately; `hasProposedCheckpoint` is derived from the payload.
Migrated `automine_sequencer` next-checkpoint-number computation.
- **aztec-node**: `getCheckpointNumber` and
`getCheckpoint`/`resolveCheckpointParameter` no longer resolve the
`'proposed'` checkpoint tag; `simulatePublicCalls` derives the
proposed-checkpoint frontier from the payload (no extra block fetch in
the leading case). `getChainTips` simplified.
- **pxe / txe**: deleted the `proposedCheckpoint: tips.checkpointed`
fabrication shims in the PXE node adapter and the TXE archiver.

Fixes A-978
Wait after the scheduled send time until the previous L1 block is
observed before submitting.

Alternative to #24035.

Fixes A-1207
## What

`e2e_fees/account_init` → *"account pays its own fee › pays natively in
the Fee Juice by bridging funds themselves"* flakes while preparing L1
fee juice (before the account deploy), failing on the bridge write:

```
ContractFunctionExecutionError: ... contract function "depositToAztecPublic"
  from:    0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
  Details: replacement transaction underpriced
```

CI: http://ci.aztec-labs.com/60b00f39a7a0e3cc

## Why

The fee-juice bridge test harness does direct L1 writes (`mint` →
`approve` → `depositToAztecPublic`) through a plain viem client with
**no nonce manager**, so each write picks its nonce via
`getTransactionCount('pending')`. That client uses Anvil account
`0xf39f…92266` (mnemonic index 0) — which is **also the sequencer
publisher account**. The publisher tracks its own nonce in
`l1-tx-utils`, independently of the test client.

When a test write and a publisher checkpoint tx land on the same nonce
concurrently, the higher-priced publisher tx wins and the test's
lower-priced tx is rejected. In the failed run the publisher was sending
on account #0 at ~95–120 gwei:

```
node:l1-tx-utils Sent L1 transaction 0x9652… {nonce: 67, account: 0xf39f…92266, maxFeePerGas: "95.396…"}
```

while the test's `depositToAztecPublic` went out on the same account at
nonce **60** / ~26 gwei → `replacement transaction underpriced`.

The shared-account race is pre-existing but rarely hit; #24037 (changing
publisher send timing) makes the overlap more likely, which surfaced it.

## Fix

Send the harness's direct L1 writes from a dedicated mnemonic account
(`L1_DIRECT_WRITE_ACCOUNT_INDEX = 1`) instead of the deployer/publisher
account #0. Index 1 is unused by any L1-tx sender in these setups
(publisher = 0, prover = 2, validators = 3+), so the test's writes no
longer share a nonce stream with the publisher. Affects `FeesTest` and
`ClientFlowsBenchmark`.
…PublicCalls (#24031)

## Motivation

`AztecNodeService.simulatePublicCalls` built globals for a fictional
next block using "the L2 slot at the next L1 slot" with no pipelining
offset and no L1 state overrides. Under proposer pipelining this
simulates one slot too early and computes the mana min fee against the
current L1 state, ignoring the gossiped proposed checkpoint that will
land before the simulated block does — producing wrong L1-to-L2 message
sets and wrong fees for wallet-side estimation.

## Approach

The simulation now mirrors how the sequencer picks the build slot and
fee inputs (`sequencer.ts` / `checkpoint_proposal_job.ts` / the
checkpoint simulation overrides builder), split into two cases:

- **Mid-checkpoint continuation** (proposed blocks past the last
checkpoint proposal): every block in a checkpoint shares the same
checkpoint-wide globals, so the next block's globals are copied verbatim
from the latest proposed block header with only the block number bumped.
No L1 reads, no L1-to-L2 message insertion. A missing header fails the
request rather than falling through, since the fork already contains the
ongoing checkpoint's messages and a fall-through would insert them
twice.
- **Opening a new checkpoint**: the target slot is the sequencer's
formula (`next-L1-slot L2 slot + PROPOSER_PIPELINING_SLOT_OFFSET`),
maxed with `proposedCheckpointSlot + 1` when a checkpoint proposal is
pending L1. The same `SimulationOverridesPlan` the sequencer uses is
always applied: parent archive/temp-log/fee-header overrides derived
from the proposed checkpoint when pipelining, tips pinned to the
rollback target when the pending chain is invalid, and tips pinned to
the checkpointed tip otherwise (neutralizing prunes in fee computation).
Both the target slot and the plan derive from a single
`ProposedCheckpointData` read so they cannot disagree about the parent.

The simulator trusts archiver tips verbatim — no staleness guards. Stale
proposed blocks and checkpoints are the archiver's responsibility
(orphan pruning and L1-authoritative eviction), so a stale tip
mis-simulates only transiently.

The logic moves out of `AztecNodeService` into a new
`NodePublicCallsSimulator` class so the slot/globals selection is
unit-testable without standing up a node. Supersedes #23389.

Caching of the L1 fee queries introduced here is tracked separately in
A-1208.

**Structured for commit-by-commit review:**
1. `refactor(aztec-node)`: pure move of the existing
`simulatePublicCalls` code (and its tests) into
`NodePublicCallsSimulator` — no behavior change.
2. `fix(aztec-node)`: the actual behavior changes described above.
3. `refactor`: moves the checkpoint simulation overrides builder from
sequencer-client to `@aztec/stdlib/checkpoint` so the node consumes it
from a shared home.

## API changes

- `GlobalVariableBuilder.buildGlobalVariables` is removed from the
stdlib interface and its implementations (sequencer-client, TXE); slot
selection is now the caller's job and only
`buildCheckpointGlobalVariables` remains.
- `buildCheckpointSimulationOverridesPlan` and
`computePipelinedParentFeeHeader` move from sequencer-client internals
to `@aztec/stdlib/checkpoint`, shared by the sequencer and the node
simulator.
- `AztecNodeService` constructor takes an optional `RollupContract`;
environments that never see a proposed checkpoint or an invalid pending
chain (TXE) may omit it, and the simulator fails loudly if those paths
are reached without it.

## Changes

- **aztec-node**: new `NodePublicCallsSimulator` with the two-case
globals selection and always-on overrides plan; `simulatePublicCalls`
reduced to a thin delegate.
- **aztec-node (tests)**: 12 unit tests covering verbatim mid-checkpoint
continuation, the missing-header guard, the pipelining offset,
parent-slot/override derivation from the proposed checkpoint data,
torn-snapshot fallback, invalid-pending-chain tips pinning, message
insertion semantics, and the rollup-contract invariant.
- **stdlib**: gains `checkpoint/simulation_overrides.ts` (moved from
sequencer-client, with its unit tests); `buildGlobalVariables` removed
from the `GlobalVariableBuilder` interface.
- **sequencer-client**: imports the overrides builder from stdlib;
`buildGlobalVariables` removed from the implementation.
- **txe**: `buildGlobalVariables` removed; TXE node construction passes
no rollup contract.

Fixes A-1063

@ludamad ludamad left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Auto-approved

@AztecBot

Copy link
Copy Markdown
Collaborator Author

🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.

@AztecBot AztecBot enabled auto-merge June 12, 2026 19:37
AztecBot and others added 4 commits June 12, 2026 22:04
…PI (#24066)

Fixes A-1220

## Why

`node.registerContractFunctionSignatures` was an always-on,
unauthenticated write on the main `AztecNode` JSON-RPC surface. It
populates a per-node, in-memory, non-gossiped selector→name map used
only to name public-execution stack traces, so it has no business on a
prod node's public API (anyone can spam it up to the cap, and you'd have
to keep talking to the same node for it to mean anything). Decoding
traces should be a client concern.

## What

Relocate the method from the always-on `AztecNode` interface to the
existing, gated `AztecNodeDebug` interface (served under the
`nodeDebug_` namespace, only mounted when the node runs with debug
endpoints enabled — always on in the in-process sandbox, off by default
on standalone nodes). The node-side read path (`getDebugFunctionName` →
AVM trace naming) is unchanged; only the RPC write moves.

The PXE now calls it through an **optional** `nodeDebug` client and
skips registration entirely when the node does not expose the debug API.
PXE-side simulation errors keep named traces regardless, via the
existing ABI-based enrichment in `pxe/src/error_enriching.ts`.

- **stdlib**: removed `registerContractFunctionSignatures` from
`AztecNode` + `AztecNodeApiSchema`; added it (and the length/count caps)
to `AztecNodeDebug` + `AztecNodeDebugApiSchema`.
- **aztec-node**: server impl unchanged — `AztecNodeService` already
implements `AztecNodeDebug`, so the method now satisfies that interface
and is served under `nodeDebug_`.
- **pxe**: optional `nodeDebug?: AztecNodeDebug` on
`PXECreateArgs`/constructor; both registration call sites become
`this.nodeDebug?.registerContractFunctionSignatures(...)`. Threaded
`options.nodeDebug` through all three `createPXE` entrypoints (option A
— explicit injection).
- **e2e**: wired `nodeDebug` in `end-to-end/src/fixtures/setup.ts` for
both PXE-creation paths — `setup()` passes the in-process node service
directly; `setupPXEAndGetWallet()` passes it via an `isAztecNodeDebug`
type guard (one caller types its node loosely).
- **tests**: moved the schema round-trip case from the node interface
test to the debug interface test; retargeted the PXE registration tests
to the `nodeDebug` mock; updated the `node_rpc_perf` benchmark.
- **docs**: dropped the moved method from the generated node API
reference (the generator only documents `node_`/`nodeAdmin_`), and added
a migration note.

## Breaking change

`node_registerContractFunctionSignatures` is removed from the main node
RPC. Callers must use `nodeDebug_registerContractFunctionSignatures`
against a debug-enabled node. See the migration note.

## Verification

Local build could not be run here: the workspace is a cold checkout
(noir submodule packages absent, no Rust/Docker), so `yarn
install`/`yarn build` cannot bootstrap. The change is a contained
TypeScript interface relocation + optional-client wiring; relying on CI
for the full build/lint/test. A follow-up could wire
`createAztecNodeDebugClient(nodeUrl)` into the URL-based PXE creators
(bot/embedded wallet) if named traces are wanted against a debug-enabled
remote node.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/b8d3c4fab6c0225a) ·
group: `slackbot`*
## Conflict resolution — review this PR

Stacked on top of the raw merge PR #24071
(`cb/merge-train-spartan-v5-raw`). This PR is the reviewable diff that
resolves the one conflict from merging `v5-next` into
`merge-train/spartan-v5`. Land order when green: this PR → raw PR →
`merge-train/spartan-v5`.

### Conflicts resolved (1)

**`yarn-project/end-to-end/src/bench/client_flows/client_flows_benchmark.ts`**
— import-list divergence:
- Train side imported `L1_DIRECT_WRITE_ACCOUNT_INDEX` (from
`fixtures.js`) and `deployAccounts` (from `setup.js`); v5-next had
neither.
- **Kept** `L1_DIRECT_WRITE_ACCOUNT_INDEX` — it is exported by
`fixtures.ts` (`export const L1_DIRECT_WRITE_ACCOUNT_INDEX = 1`) and
used in the file body (passed to the L1 client at the deployer-account
index, ~line 255).
- **Dropped** `deployAccounts` — it is no longer exported by the merged
`setup.ts` (v5-next refactored it away) and is unused anywhere in the
file body, so importing it would break the build.

No fixtures/generated constants were involved; this is a pure import
resolution. Full TS build is validated by CI on this PR.

### Parents
- train `merge-train/spartan-v5` @ `d979a26`
- base  `v5-next` @ `5fd6187`

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/0e582efb7d4723ed) ·
group: `slackbot`*
…rs) (#24071)

## Raw merge — do not merge alone

Raw `git merge --no-ff` of `v5-next` into `merge-train/spartan-v5` with
conflict markers **committed**. This PR is red by design and exists only
as the base of the stacked resolution PR. Review it together with the
fix PR `cb/merge-train-spartan-v5-fix`.

### Parents
- train `merge-train/spartan-v5` @
`d979a26bc060068ecc33c0b1d88a8dac7e941bef`
- base  `v5-next` @ `5fd61870609b88b14dfcd2924453bd81f77ef0b5`
- merge-base `ab5413c72dc5377107943b8614130ec8050bf06c`

### Conflicted files (1)
-
`yarn-project/end-to-end/src/bench/client_flows/client_flows_benchmark.ts`

Resolution + any regeneration lands in the stacked fix PR.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/0e582efb7d4723ed) ·
group: `slackbot`*
@PhilWindle PhilWindle disabled auto-merge June 14, 2026 17:16
@PhilWindle PhilWindle enabled auto-merge June 14, 2026 17:17
@PhilWindle PhilWindle added this pull request to the merge queue Jun 14, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 14, 2026
@PhilWindle PhilWindle added this pull request to the merge queue Jun 14, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 14, 2026
PhilWindle and others added 2 commits June 15, 2026 17:10
## Summary

- Keep the shared cross-chain rollup/inbox/outbox client on the deployer
mnemonic account so L2->L1 message consumption signs with the expected
recipient.
- Let L1-writing cross-chain suites opt their harness into
`L1_DIRECT_WRITE_ACCOUNT_INDEX` to avoid nonce races with the sequencer
publisher under pipelining.
- Apply that dedicated harness account to token bridge and L1->L2
message tests.
- Fix Prettier formatting reported by CI.

## Verification

- `git diff --check`
- `./bootstrap.sh ci` (blocked locally: bootstrap reports `Unknown
command: ci`; the container also has no Docker/Redis access)
- `yarn-project/./bootstrap.sh format --check` (blocked locally: missing
`yarn-project/node_modules/.bin/prettier`)
@PhilWindle PhilWindle enabled auto-merge June 15, 2026 18:05
@PhilWindle PhilWindle added this pull request to the merge queue Jun 15, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 15, 2026
…nistic (#24106)

## Why the merge train was dequeued

PR #24053 (`merge-train/spartan-v5` → `v5-next`) was dequeued from the
merge queue because its `merge-queue-heavy` CI run failed. Tracing the
failure:

- CI3 merge_group run `27568575541` → shard `x4-full`
(`ci-full-no-test-cache`) failed.
- The failing test was
`yarn-project/world-state/src/native/native_world_state.test.ts` (`code:
1`), specifically:
> `NativeWorldState › Pending and Proven chain › does not fail when a
delayed-close fork is destroyed by a reorg before its close fires`
- Failing assertion (`native_world_state.test.ts:966`):
  ```
expect((ws as any).instance.queues.has(forkId)).toBe(false); //
Expected: false, Received: true
  ```
The `warnSpy` assertion on the preceding line passed; only the
queue-cleanup assertion failed.

This test and the world-state instance/facade code it exercises are
**not modified by this train** — the test already exists unchanged in
`v5-next`. So this is a **pre-existing timing flake** that happened to
fail in this grind and blocked the train.

## Root cause

The delayed-close path is correct and always eventually cleans up:

- `asyncDispose()` schedules `sleep(closeDelayMs).then(() => close())`
(fire-and-forget).
- `close()` → `doClose()` → `instance.call(DELETE_FORK)`, and the
per-fork queue entry is deleted in `call()`'s `finally` regardless of
whether the native side rejects with `Fork not found` (which it does
here, because the reorg/unwind already destroyed the native fork).

The test then waited a **fixed** `sleep(closeDelayMs * 3)` (3s) before
asserting the queue was gone. Under a loaded merge-queue grind, the
scheduled close (fired at +`closeDelayMs`) plus its async `DELETE_FORK`
round-trip and `queue.stop()` can take longer than that fixed margin, so
the queue entry was still present when the assertion ran — Received
`true`.

## Fix

Replace the racy fixed sleep with a deterministic wait on the exact
asserted condition:

```ts
await retryUntil(() => !(ws as any).instance.queues.has(forkId), 'delayed-close fork queue cleanup', 30, 0.1);
```

This polls until the queue entry is removed (the cleanup runs inside the
same close path), with a 30s timeout. It preserves the regression
intent: if cleanup genuinely stopped happening, `retryUntil` throws a
clear `Timeout awaiting delayed-close fork queue cleanup` instead of a
bare `toBe(false)`. The production code is unchanged; no test is skipped
or weakened. The now-unused `sleep` import is swapped for `retryUntil`.

## Verification

This is a test-only, single-statement timing change. I was unable to run
`./bootstrap.sh ci` in this environment — the workspace has no
`node_modules` and no barretenberg/native `world_state_napi` build
artifacts, so exercising this native test would require a multi-hour C++
+ TS build that isn't proportionate to a wait-deterministically change.
The change is type-checked by inspection: `retryUntil(fn, name,
timeoutSec, intervalSec)` matches the foundation signature, and the
import path (`@aztec/foundation/retry`) is the one used across
yarn-project. CI on this PR will validate it.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/6fb41d03fbbd7aaa) ·
group: `slackbot`*
AztecBot and others added 2 commits June 16, 2026 00:13
## Summary

- Investigated the `merge-train/spartan-v5` dequeue for train PR #24053.
- Latest dequeue was from merge-group CI, not a merge conflict: #24053
is currently mergeable/clean on `merge-train/spartan-v5`.
- The failing merge-group run was `27576008840` on
`gh-readonly-queue/v5-next/pr-24053-fc2322d6...`, failing `ci/x3-full`
via `yarn-project/kv-store/scripts/run_test.sh
src/bench/sqlite-opfs-encrypted/map_bench.test.ts`.
- Root cause: skipped-by-default browser benchmarks still imported
SQLite/OPFS and shared benchmark dependencies at module load, so normal
CI test discovery could fail before the `describe.skip` path.
- Fix: lazy-load benchmark dependencies only when `VITE_BENCH=1` for the
IndexedDB, SQLite-OPFS, and encrypted SQLite-OPFS map benchmarks.

Failed dashboard log: http://ci.aztec-labs.com/dbe70cb905f2df69

## Verification

- `CI=1 PATH="$HOME/.cargo/bin:$PATH" scripts/run_test.sh
src/bench/sqlite-opfs-encrypted/map_bench.test.ts` from
`yarn-project/kv-store` passes with the benchmark skipped.
- `CI=1 PATH="$HOME/.cargo/bin:$PATH" scripts/run_test.sh
src/bench/sqlite-opfs/map_bench.test.ts` passes with the benchmark
skipped.
- `CI=1 PATH="$HOME/.cargo/bin:$PATH" scripts/run_test.sh
src/bench/indexeddb/map_bench.test.ts` passes with the benchmark
skipped.
- `yarn format kv-store --check` passes.

Notes: the requested root `./bootstrap.sh ci` is not a valid bootstrap
command in this checkout (`Unknown command: ci`). I also tried the
apparent CI equivalent, `./bootstrap.sh ci-full-no-test-cache`; it did
not complete in this container because the local CI engine/tooling
failed before reaching these tests.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/72862d04194777fa) ·
group: `slackbot`*
@PhilWindle PhilWindle enabled auto-merge June 16, 2026 08:36
@PhilWindle PhilWindle added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@PhilWindle PhilWindle added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@PhilWindle PhilWindle enabled auto-merge June 16, 2026 10:02
@PhilWindle PhilWindle added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
…6) (#24041)

## Motivation

When a node receives a block proposal whose index within its checkpoint
exceeds the per-checkpoint block limit, it rejects the gossip message
and penalizes the relaying peer. But an oversized checkpoint is produced
by the proposer, not the relay: the proposer should be slashed, and the
honest relay left alone. Because the proposal was rejected at ingress,
it was never stored nor re-broadcast, so no other node could observe the
evidence and a slash for this offense could never fire (slashing is
social-consensus voting and needs enough validators to independently see
the offense, as with equivocation).

## Approach

Mirrors how equivocation slashing works today.

- At gossip validation, only indices at or beyond the structural ceiling
`MAX_ATTESTABLE_BLOCKS_PER_CHECKPOINT` (72) are rejected with a peer
penalty. Indices in `[maxBlocksPerCheckpoint, 72)` are structurally
valid proposer misbehavior and pass validation. Proposer/signature/slot
checks still run first, so only the slot's legitimate proposer can get a
proposal this far.
- Oversized proposals ride the regular attestation-pool store path, so
they are retained and re-broadcast as slashing evidence with storage and
amplification bounded by the existing per-position and per-slot pool
caps, and equivocation detection keeps working at oversized indices. The
gossip handlers never process or attest to an oversized proposal:
`isOversized` is computed once during validation and threaded through
the validation-result metadata alongside `isEquivocated`. Pool-cap
violations keep their relay penalty regardless of oversizedness — relays
must respect the caps even when forwarding slashing evidence.
- When the p2p service stores an oversized proposal it invokes a new
oversized-proposal callback (the same pattern as the duplicate-proposal
callback for equivocation, covering both the standalone block and the
checkpoint-embedded gossip paths). The validator client handles it by
emitting `BROADCASTED_INVALID_BLOCK_PROPOSAL` with the existing
`slashBroadcastedInvalidBlockPenalty`, deduped per (proposer, slot): a
signed block proposal at an illegal index is invalid in itself, no
checkpoint proposal needed. The validator-client checkpoint-level
`too_many_blocks_in_checkpoint` check stays as defense-in-depth, though
it cannot fire for the gossip case since oversized proposals are never
processed.
- When `maxBlocksPerCheckpoint` is undefined (local/test setups), the
oversized paths no-op and only the structural ceiling applies.

## Changes

- **p2p**: `ProposalValidator.validateBlockIndexWithinCheckpoint`
rejects only at the structural ceiling; `LibP2PService` threads
`isOversized` through validation metadata, skips processing/attestation
for oversized block and checkpoint proposals, and reports stored
oversized proposals through a new `registerOversizedProposalCallback`.
- **validator-client**: registers the oversized-proposal callback and
emits the `BROADCASTED_INVALID_BLOCK_PROPOSAL` offense for the proposer,
deduped per (proposer, slot).
- **tests**: cover the new accept window, store/re-broadcast/no-process
invariants on both gossip paths, duplicate and equivocating oversized
proposals, pool-cap rejections, callback firing per stored proposal,
offense emission and dedup, and undefined-config no-op.

An e2e exercising the full slash flow is left as a follow-up.

Fixes A-1166
## Human-written summary

Fixes three potential issues in the block-stream:

1. We snapshot the source tips, then download blocks until the proposed
tip and pass them to the consumer, and then emit the checkpointed and
proven tips. If there's a prune while we're downloading blocks, we may
not be able to fetch blobs until the proposed tip. And worse, we may be
then emitting checkpointed and proven events that reference blocks _we
could never pass onto the consumer_. This PR fixes it by aborting all
events if we fail to collect all the blocks we expected, and waiting
until the next iteration to heal.

2. When we detect a prune, we walk back to find the common ancestor. But
that walk back may take time. During that walk back, the checkpointed
and proven tips may have shifted, so we re-fetch them to advertise the
proper latest tips.

3. When walking back to find the common ancestor during a prune, we
compare block hashes of source vs consumer. If the consumer didn't see a
block for whatever reason, but that block was also removed from the
source, then both block hashes compared are undefined, and we stop the
walk-back, instead of continuing until finding an actual existing common
ancestor.

4. When walking back, if the consumer for some reason did not have the
block hashes available, a walk back to genesis would be triggered. We
now cap the walk back to the local finalized chain tip.

## Motivation

This PR hardens the block-mode `L2BlockStream` against three
pre-existing hazards. It is split out of
#24034 (the `chain-proposed`/`tips-only` feature work) so the
correctness fixes can be reviewed on their
own — #24034's `chain-proposed` emission depends on the pass-completion
gate introduced here, so this PR
sits below it in the stack.

The three hazards, all latent in block mode today:

- **Tier cursors advance from an incomplete download pass.** The
download loop broke on an empty
`getBlocks` result and end-of-pass reconciliation still ran, so
`chain-checkpointed`/`chain-proven`/
`chain-finalized` could reference blocks the consumer never received via
`blocks-added` — exactly the
  transient-inconsistency class #24007 set out to remove.
- **`chain-pruned` carries stale clamp tips.** The prune event's
`checkpointed`/`proven` payload came from
the pass snapshot, which on a stale pass describes the chain *before*
the prune. p2p feeds
`event.checkpointed.checkpoint` into `isEpochPrune`, whose contract is
"the checkpoint id **after** the
prune", and the result drives the irreversible `deleteAllTxs` mempool
wipe — a misclassification risk.
- **The walk-back stops above the true divergence.**
`areBlockHashesEqualAt` ended with
`localHash === sourceHash`, so a height missing on both sides
(`undefined === undefined`) counted as
agreement. With a store that has gaps in its hash index (p2p keeps none
below its `startingBlock`), a gap
meeting a source-pruned height stopped the walk above the divergence —
an under-deep prune that left
  old-fork state in place with no later re-detection.

## Approach

- **Pass atomicity.** Extract the block download loop into
`downloadBlocks()`, which returns whether the
download plan completed; tier reconciliation is skipped for the pass
when it did not. Two staleness
terminations: an empty `getBlocks` below the target, and a delivered
proposed block whose hash differs
from the snapshot's (a mid-pass same-height fork swap). Intentional
skips — the loop never running
because `startingBlock`/caught-up put the cursor past the proposed tip —
are trivially complete and still
  reconcile, preserving the A-1061 fix.
- **Fresh reorg payload.** On walk-back divergence, re-read `getL2Tips`
and drive the `chain-pruned` clamp
tips, the #13471 prune-target clamp, the download target, and tier
reconciliation from the re-read rather
than the pre-prune snapshot. The pass does not abort if the re-read's
proposed tip moved on (a busy chain
  advances every pass and prune emission would otherwise starve).
- **Walk-back missing-local-hash rule.** A missing **local** hash now
compares unequal regardless of the
source side, so the walk continues to a recorded matching height or
genesis (block 0 always resolves via
the store's `initialBlockHash`). Prunes become over-deep-only, which
consumers tolerate. The
  `skipFinalized` AbortError path is unchanged.
- The walk-back hash cache is seeded with the **proposed tip only**.
Seeding the tier tips would let a
snapshot entry that went stale (a reorg between the snapshot and the
walk) fake agreement at a reorged
height and stop the walk early; the re-read appends only its proposed
tip after the walk is over.

**Behavior change worth calling out:** a *permanently* missing source
range previously advanced tier
cursors past undelivered blocks (masking the source bug). With pass
atomicity it now wedges loudly — a
warn each pass, p2p stays SYNCHING — instead of silently diverging. Loud
failure is the better behavior.

## Changes

- `stdlib/src/block/l2_block_stream/l2_block_stream.ts`:
`downloadBlocks()` extraction + completion gate;
post-divergence `getL2Tips` re-read driving the prune
payload/target/download/reconciliation; missing
  local-hash → unequal; proposed-only cache seeding.
- `stdlib/src/block/l2_block_stream/l2_block_stream.test.ts`:
pass-atomicity terminations (empty
`getBlocks`, delivered-hash mismatch, A-1061 fast-forward still
reconciles), prune-payload freshness,
re-read download target + reconciliation, and the walk-back
both-undefined / stale tier-seed regressions.
@PhilWindle PhilWindle enabled auto-merge June 16, 2026 11:27
@PhilWindle PhilWindle added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
…24130)

## Summary

- Fixes the merge-queue failure on PR #24053 by making
`CheckpointStore`'s pruning test use explicit slots for its synthetic
checkpoints.
- The failed CI run was merge-group run `27616214374`; `CI3` job
`81652669415` failed in `x9-full`
(`http://ci.aztec-labs.com/11cc1d3bfd3986cf`).
- The failing test was `yarn-project/scripts/run_test.sh
prover-node/src/checkpoint-store.test.ts`, log
`http://ci.aztec-labs.com/0e7cfabe067f8bed`, where `markPrunedAboveBlock
marks every prover holding a block above the target and returns them`
hit the stricter same-slot canonical checkpoint guard while building
random test data.

## Verification

- `git diff --check`
- `./bootstrap.sh ci` attempted, but this branch's root bootstrap has no
`ci` target and the container has no Docker socket.
- `./bootstrap.sh ci-full-no-test-cache` attempted to match the failed
merge-queue leg, but local bootstrap failed before project tests because
the test engine exited and interrupted the parallel build.
- Focused Yarn/Jest verification is blocked in this checkout because
`yarn-project` portal dependencies under `noir/packages/*` are not
generated; `noir/bootstrap.sh` is blocked by a crates.io DNS/proxy
failure while installing `just`.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/46e781f113c5bc56) ·
group: `slackbot`*
@PhilWindle PhilWindle added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@AztecBot

AztecBot commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator Author

Flakey Tests

🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry.

\033FLAKED\033 (8;;http://ci.aztec-labs.com/ce6bd5a3a5183b01�ce6bd5a3a5183b018;;�):  yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_optimistic_proving.parallel.test.ts "removes a checkpoint mid-epoch via reorg and proves with survivors" (231s) (code: 0) group:e2e-p2p-epoch-flakes

Merged via the queue into v5-next with commit 5fd0d02 Jun 16, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants