
Fix resume when local metadata branch is stale + preserve diverged refs #1252

Open: Soph wants to merge 3 commits into main from soph/fix-resume-stale-local-metadata-branch

Collaborator

@Soph Soph commented May 22, 2026

Summary

Two related fixes for entire resume and the underlying ref-advance helper. They compose: the second hardens the helper so the first is fully safe in every scenario.

1. Fix the user-visible bug (resume.go)

promoteRemoteTrackingMetadataBranch returned early whenever the local entire/checkpoints/v1 ref existed — even when origin/entire/checkpoints/v1 was ahead. The committed-checkpoint store used by RestoreLogsOnly and resumeSingleSession only falls back to refs/remotes/origin/... when the local ref is missing, so entire resume printed "session log not available" for checkpoints already in the remote-tracking ref.

Drop the early return so SafelyAdvanceLocalRef runs unconditionally — same pattern FetchMetadataBranch already uses after a real fetch.
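
To illustrate why a stale local ref hid checkpoints that were already in the remote-tracking ref, here is a minimal, self-contained sketch. It does not use the real store or any git library: the `refs` map, the `checkpointsAt` table, the commit and checkpoint IDs, and `hasCheckpoint` are all hypothetical stand-ins, and the full `refs/heads/...` name for the local branch is an assumption.

```go
package main

import "fmt"

const (
	localRef  = "refs/heads/entire/checkpoints/v1"          // assumed full name of the local branch
	remoteRef = "refs/remotes/origin/entire/checkpoints/v1" // remote-tracking ref
)

// refs is a toy ref database: ref name -> commit ID.
type refs map[string]string

// checkpointsAt is toy data: which checkpoint IDs are readable at each commit.
var checkpointsAt = map[string][]string{
	"c1": {"cp-A"},
	"c2": {"cp-A", "cp-B"}, // c2 is a descendant of c1 that adds cp-B
}

// hasCheckpoint models the read path described above: the remote-tracking
// ref is consulted only when the local ref is missing entirely.
func hasCheckpoint(r refs, id string) bool {
	tip, ok := r[localRef]
	if !ok {
		tip, ok = r[remoteRef]
	}
	if !ok {
		return false
	}
	for _, cp := range checkpointsAt[tip] {
		if cp == id {
			return true
		}
	}
	return false
}

func main() {
	r := refs{localRef: "c1", remoteRef: "c2"} // local exists but is stale
	fmt.Println("before promotion:", hasCheckpoint(r, "cp-B"))

	r[localRef] = r[remoteRef] // fast-forward, as the promotion step now does
	fmt.Println("after promotion:", hasCheckpoint(r, "cp-B"))
}
```

In this model the lookup for `cp-B` only succeeds once the local ref has been advanced, which is the behavior the early return prevented.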

2. Protect unpushed local commits in SafelyAdvanceLocalRef (strategy/common.go)

The previous contract overwrote any local ref the target couldn't reach, which silently discarded unpushed local commits whenever a fetch landed sibling commits, for example checkpoint metadata produced on another machine sharing the same orphan-style refs (entire/checkpoints/v1, the V2 main ref). The objects stayed in the object database until git gc pruned them, but the branch ref no longer pointed at them.

Tighten the helper so it only sets the ref when the move is non-destructive:

  • missing → create
  • local at/ahead of target → no-op (existing protection)
  • local strictly behind target → fast-forward
  • diverged or unrelated history → no-op with a debug log
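
The four cases above can be sketched as a small decision function over an in-memory commit graph. This is an illustration under stated assumptions, not the code in strategy/common.go: the `graph` type, `isAncestor`, `decide`, and the `action` constants are hypothetical names, and the real helper works against a git repository rather than a map.

```go
package main

import "fmt"

// graph maps a commit ID to its parent IDs (a toy stand-in for a git object store).
type graph map[string][]string

// isAncestor reports whether anc is reachable from desc.
// A commit counts as its own ancestor.
func (g graph) isAncestor(anc, desc string) bool {
	seen := map[string]bool{}
	stack := []string{desc}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if c == anc {
			return true
		}
		if seen[c] {
			continue
		}
		seen[c] = true
		stack = append(stack, g[c]...)
	}
	return false
}

type action string

const (
	actionCreate      action = "create"
	actionNoop        action = "no-op (local at/ahead)"
	actionFastForward action = "fast-forward"
	actionPreserve    action = "no-op (diverged or unrelated)"
)

// decide picks the action for moving a local ref toward target.
// An empty local means the local ref is missing.
func decide(g graph, local, target string) action {
	switch {
	case local == "":
		return actionCreate
	case g.isAncestor(target, local):
		return actionNoop // target is reachable from local
	case g.isAncestor(local, target):
		return actionFastForward // local is strictly behind target
	default:
		return actionPreserve // neither side reaches the other
	}
}

func main() {
	// a <- b <- c is the main line; d is a sibling of b; x is an unrelated root.
	g := graph{"a": nil, "b": {"a"}, "c": {"b"}, "d": {"a"}, "x": nil}
	for _, local := range []string{"", "c", "a", "d", "x"} {
		fmt.Printf("local=%q target=%q -> %s\n", local, "b", decide(g, local, "b"))
	}
}
```

Only the first and third outcomes would move the ref; the other two leave it untouched, which is what keeps unpushed local commits reachable.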

All three production callers (promoteRemoteTrackingMetadataBranch, FetchMetadataBranch, PromoteTmpRefSafely) sync orphan branches where this is the correct semantic. In the rare divergent case, resume falls through to remote-tracking-tree reads (v1) or the DualCheckpointReader → V1 fallback.

TestFetchV2MainFromURL_UpdatesExistingRef was updated to use the existing advanceV2MainOnTop helper — the prior setup advanced via a second call to createV2MainRef (always an unrelated orphan commit), which was only "working" via the now-removed diverged-overwrite path. The realistic CLI flow produces a descendant commit on top of the previous tip.
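
The difference between the two test setups can be shown with a toy model. The sketch below is hypothetical: `commit`, `newOrphan`, `newChild`, and `reaches` are illustrative names and do not reflect the signatures of createV2MainRef or advanceV2MainOnTop. It only demonstrates that a fresh orphan commit cannot reach the previous tip, while a commit built on top of it can.

```go
package main

import "fmt"

// commit is a toy single-parent commit; parent is nil for an orphan (root) commit.
type commit struct {
	id     string
	parent *commit
}

// newOrphan models a setup step that always creates an unrelated root commit.
func newOrphan(id string) *commit { return &commit{id: id} }

// newChild models a setup step that commits on top of an existing tip.
func newChild(id string, parent *commit) *commit {
	return &commit{id: id, parent: parent}
}

// reaches reports whether walking parent links from tip arrives at anc.
func reaches(tip, anc *commit) bool {
	for c := tip; c != nil; c = c.parent {
		if c == anc {
			return true
		}
	}
	return false
}

func main() {
	oldTip := newOrphan("tip-1")

	unrelated := newOrphan("tip-2-orphan")         // second orphan: shares no history
	descendant := newChild("tip-2-on-top", oldTip) // descendant of the previous tip

	fmt.Println("orphan reaches old tip:", reaches(unrelated, oldTip))
	fmt.Println("descendant reaches old tip:", reaches(descendant, oldTip))
}
```

Under the tightened helper, only the descendant case qualifies as a fast-forward, so a test that expects the ref to update needs the second kind of setup.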

Commits

  1. Fix resume when local metadata branch is stale — the immediate user-visible fix
  2. Simplify resume tests after review — dedupe + use existing agent.AgentTypeClaudeCode constant
  3. Preserve diverged local refs in SafelyAdvanceLocalRef — the broader hardening

Test plan

  • mise run check passes (fmt, lint, unit, integration, canary)
  • Failing-before-fix tests in resume_test.go:
    • TestPromoteRemoteTrackingMetadataBranch_FastForwardsStaleLocal
    • TestResumeFromCurrentBranch_FastForwardsStaleLocalMetadata (reproduces the user-visible "session log not available" message)
  • Failing-before-fix tests in strategy/safely_advance_local_ref_test.go:
    • TestSafelyAdvanceLocalRef_Diverged_PreservesLocal
    • TestSafelyAdvanceLocalRef_UnrelatedHistory_PreservesLocal
  • Existing TestFetchMetadataBranch_DoesNotRewindLocalAhead still passes
  • Existing TestFetchV2MainFromURL_DoesNotRewindLocalAhead still passes
  • Manual: in a real repo where local entire/checkpoints/v1 is behind origin, entire resume <branch> restores the session instead of printing "session log not available"

🤖 Generated with Claude Code

promoteRemoteTrackingMetadataBranch returned early whenever the local
entire/checkpoints/v1 ref existed, even when origin/entire/checkpoints/v1
was ahead. The committed-checkpoint store consulted by RestoreLogsOnly
and resumeSingleSession only falls back to origin/... when the local ref
is missing entirely, so `entire resume` printed "session log not
available" for checkpoints already present in
refs/remotes/origin/entire/checkpoints/v1.

Drop the early-return so SafelyAdvanceLocalRef runs unconditionally — it
no-ops when local is at or ahead of the target (preserving any unpushed
local commits) and fast-forwards when behind. Same pattern
FetchMetadataBranch already uses after a real fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 6372b938a1f7
@Soph Soph requested a review from a team as a code owner May 22, 2026 18:51
Copilot AI review requested due to automatic review settings May 22, 2026 18:51
Contributor

Copilot AI left a comment


Pull request overview

Fixes entire resume behavior when the local metadata branch (entire/checkpoints/v1) exists but is stale relative to refs/remotes/origin/entire/checkpoints/v1, by ensuring the local ref is safely advanced before reading committed checkpoint metadata.

Changes:

  • Removed the early return in promoteRemoteTrackingMetadataBranch so it always attempts to advance the local metadata ref to the remote-tracking hash (while avoiding rewinds when local is ahead).
  • Added unit coverage for fast-forwarding a stale local metadata ref.
  • Added an end-to-end regression test covering resumeFromCurrentBranch when local metadata is behind remote-tracking metadata.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| cmd/entire/cli/resume.go | Always promotes/advances the local metadata ref from origin/... before resume metadata reads to avoid “session log not available” when remote-tracking is ahead. |
| cmd/entire/cli/resume_test.go | Adds targeted and end-to-end tests reproducing the stale-local-metadata resume failure and validating the fix. |

Comment thread cmd/entire/cli/resume.go Outdated
- Extract makeLocalMetadataBranchStale + readMetadataBranchHash helpers
  to dedupe the "advance local, mirror to remote, rewind local" setup
  shared by both new tests.
- Replace types.AgentType("Claude Code") magic string with the existing
  agent.AgentTypeClaudeCode constant.
- Trim the 8-line docstring on promoteRemoteTrackingMetadataBranch and
  drop inline test comments that narrated WHAT obvious code does;
  keep the non-obvious WHY (the bug class).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 857472164f84
The previous contract overwrote any local ref the target couldn't reach,
which silently discarded unpushed local commits whenever a fetch landed
sibling commits — e.g. checkpoint metadata produced on another machine
sharing the same orphan-style refs (entire/checkpoints/v1, the V2 main
ref). The objects survived in the loose-objects pool until git gc, but
the branch ref no longer pointed at them.

Tighten the helper so it only sets the ref when the move is
non-destructive: missing → create, local at/ahead → no-op (existing
protection), local strictly behind → fast-forward, diverged or
unrelated history → no-op with a debug log. All three production
callers (promoteRemoteTrackingMetadataBranch, FetchMetadataBranch,
PromoteTmpRefSafely) sync orphan branches where this is the correct
semantic; resume falls through to remote-tracking-tree reads in the
rare divergent case.

Update TestFetchV2MainFromURL_UpdatesExistingRef to use the existing
advanceV2MainOnTop helper — the prior setup advanced the remote via a
second call to createV2MainRef, which always produces an unrelated
orphan commit. That setup was only "updating" via the now-removed
diverged-overwrite path; a descendant commit on top of the previous
tip is what the real condensation flow produces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 4197ff1b1da6
@Soph Soph changed the title from "Fix resume when local metadata branch is stale" to "Fix resume when local metadata branch is stale + preserve diverged refs" May 22, 2026