Fix resume when local metadata branch is stale + preserve diverged refs#1252
Open
Soph wants to merge 3 commits into
Open
Fix resume when local metadata branch is stale + preserve diverged refs#1252Soph wants to merge 3 commits into
Soph wants to merge 3 commits into
Conversation
promoteRemoteTrackingMetadataBranch returned early whenever the local entire/checkpoints/v1 ref existed, even when origin/entire/checkpoints/v1 was ahead. The committed-checkpoint store consulted by RestoreLogsOnly and resumeSingleSession only falls back to origin/... when the local ref is missing entirely, so `entire resume` printed "session log not available" for checkpoints already present in refs/remotes/origin/entire/checkpoints/v1. Drop the early-return so SafelyAdvanceLocalRef runs unconditionally — it no-ops when local is at or ahead of the target (preserving any unpushed local commits) and fast-forwards when behind. Same pattern FetchMetadataBranch already uses after a real fetch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6372b938a1f7
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes entire resume behavior when the local metadata branch (entire/checkpoints/v1) exists but is stale relative to refs/remotes/origin/entire/checkpoints/v1, by ensuring the local ref is safely advanced before reading committed checkpoint metadata.
Changes:
- Removed the early return in
promoteRemoteTrackingMetadataBranchso it always attempts to advance the local metadata ref to the remote-tracking hash (while avoiding rewinds when local is ahead). - Added unit coverage for fast-forwarding a stale local metadata ref.
- Added an end-to-end regression test covering
resumeFromCurrentBranchwhen local metadata is behind remote-tracking metadata.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| cmd/entire/cli/resume.go | Always promotes/advances the local metadata ref from origin/... before resume metadata reads to avoid “session log not available” when remote-tracking is ahead. |
| cmd/entire/cli/resume_test.go | Adds targeted and end-to-end tests reproducing the stale-local-metadata resume failure and validating the fix. |
- Extract makeLocalMetadataBranchStale + readMetadataBranchHash helpers
to dedupe the "advance local, mirror to remote, rewind local" setup
shared by both new tests.
- Replace types.AgentType("Claude Code") magic string with the existing
agent.AgentTypeClaudeCode constant.
- Trim the 8-line docstring on promoteRemoteTrackingMetadataBranch and
drop inline test comments that narrated WHAT obvious code does;
keep the non-obvious WHY (the bug class).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 857472164f84
5 tasks
The previous contract overwrote any local ref the target couldn't reach, which silently discarded unpushed local commits whenever a fetch landed sibling commits — e.g. checkpoint metadata produced on another machine sharing the same orphan-style refs (entire/checkpoints/v1, the V2 main ref). The objects survived in the loose-objects pool until git gc, but the branch ref no longer pointed at them. Tighten the helper so it only sets the ref when the move is non-destructive: missing → create, local at/ahead → no-op (existing protection), local strictly behind → fast-forward, diverged or unrelated history → no-op with a debug log. All three production callers (promoteRemoteTrackingMetadataBranch, FetchMetadataBranch, PromoteTmpRefSafely) sync orphan branches where this is the correct semantic; resume falls through to remote-tracking-tree reads in the rare divergent case. Update TestFetchV2MainFromURL_UpdatesExistingRef to use the existing advanceV2MainOnTop helper — the prior setup advanced the remote via a second call to createV2MainRef, which always produces an unrelated orphan commit. That setup was only "updating" via the now-removed diverged-overwrite path; a descendant commit on top of the previous tip is what the real condensation flow produces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4197ff1b1da6
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two related fixes for
entire resumeand the underlying ref-advance helper. They compose: the second hardens the helper so the first is fully safe in every scenario.1. Fix the user-visible bug (
resume.go)promoteRemoteTrackingMetadataBranchreturned early whenever the localentire/checkpoints/v1ref existed — even whenorigin/entire/checkpoints/v1was ahead. The committed-checkpoint store used byRestoreLogsOnlyandresumeSingleSessiononly falls back torefs/remotes/origin/...when the local ref is missing, soentire resumeprinted "session log not available" for checkpoints already in the remote-tracking ref.Drop the early return so
SafelyAdvanceLocalRefruns unconditionally — same patternFetchMetadataBranchalready uses after a real fetch.2. Protect unpushed local commits in
SafelyAdvanceLocalRef(strategy/common.go)The previous contract overwrote any local ref the target couldn't reach, which silently discarded unpushed local commits whenever a fetch landed sibling commits — e.g. checkpoint metadata produced on another machine sharing the same orphan-style refs (
entire/checkpoints/v1, the V2 main ref). The objects survived in the loose-objects pool until git gc, but the branch ref no longer pointed at them.Tighten the helper so it only sets the ref when the move is non-destructive:
All three production callers (
promoteRemoteTrackingMetadataBranch,FetchMetadataBranch,PromoteTmpRefSafely) sync orphan branches where this is the correct semantic. In the rare divergent case, resume falls through to remote-tracking-tree reads (v1) or the DualCheckpointReader → V1 fallback.TestFetchV2MainFromURL_UpdatesExistingRefwas updated to use the existingadvanceV2MainOnTophelper — the prior setup advanced via a second call tocreateV2MainRef(always an unrelated orphan commit), which was only "working" via the now-removed diverged-overwrite path. The realistic CLI flow produces a descendant commit on top of the previous tip.Commits
Fix resume when local metadata branch is stale— the immediate user-visible fixSimplify resume tests after review— dedupe + use existingagent.AgentTypeClaudeCodeconstantPreserve diverged local refs in SafelyAdvanceLocalRef— the broader hardeningTest plan
mise run checkpasses (fmt, lint, unit, integration, canary)resume_test.go:TestPromoteRemoteTrackingMetadataBranch_FastForwardsStaleLocalTestResumeFromCurrentBranch_FastForwardsStaleLocalMetadata(reproduces the user-visible "session log not available" message)strategy/safely_advance_local_ref_test.go:TestSafelyAdvanceLocalRef_Diverged_PreservesLocalTestSafelyAdvanceLocalRef_UnrelatedHistory_PreservesLocalTestFetchMetadataBranch_DoesNotRewindLocalAheadstill passesTestFetchV2MainFromURL_DoesNotRewindLocalAheadstill passesentire/checkpoints/v1is behind origin,entire resume <branch>restores the session instead of printing "session log not available"🤖 Generated with Claude Code