Fix worktree index seeding #170
The lumen index command (the background indexer spawned at SessionStart) created the DB and re-indexed every file from scratch instead of copying an existing sibling-worktree index — only the MCP getOrCreate seeded, and it lost the race to create the DB. runIndexer now seeds from a donor, under the index lock it already holds, before indexing. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
getOrCreate seeded and created a new DB without the index flock that lumen index holds, so both could run SeedFromDonor against the same fresh worktree DB concurrently and corrupt it. getOrCreate now takes the same lock to seed; when a peer holds it, it waits briefly for the peer to publish the DB instead of creating an empty one that clobbers the seed. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
📝 Walkthrough
The PR adds cross-process coordination for creating index databases in sibling worktrees, preventing race conditions between the main and background indexer processes. The background indexer seeds from a donor worktree under the index lock; the main process acquires the same lock to seed, or defers by waiting for the peer to publish the DB.
Changes: Cross-process Seeding Coordination
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: 4 passed, 1 failed (1 warning)
🧹 Nitpick comments (1)
cmd/stdio.go (1)
536-542: ⚡ Quick win: Distinguish "lock held by peer" from a `TryAcquire` error.
The `default` branch fires both when the peer holds the lock (`lockErr == nil && lk == nil`) and when `TryAcquire` actually errored (`lockErr != nil`). In the error case the code silently waits the full `createWaitTimeout` for a DB that may never be published, then creates an empty one, and the underlying lock error is dropped. At minimum, log `lockErr` so flock failures are diagnosable.
♻️ Log the lock-acquire error before falling back to waiting
     default:
    +	if lockErr != nil {
    +		ic.logger().Warn("acquire index lock for seeding failed; waiting for peer DB",
    +			"db_path", dbPath, "error", lockErr)
    +	}
     	// A background indexer holds the lock and is creating + seeding the
     	// DB. Wait briefly for it to publish the file so NewIndexer opens
     	// the seeded copy instead of creating an empty DB that would clobber
     	// the seed.
     	ic.waitForDB(dbPath)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/stdio.go` around lines 536 - 542, The default branch conflates "lock held by peer" and genuine TryAcquire failures; before calling ic.waitForDB(dbPath), detect if lockErr != nil and log the error (including dbPath and lockErr) using the component's logger so TryAcquire failures are visible; keep the existing fallback wait behavior (ic.waitForDB) for the peer-held case but ensure TryAcquire errors are not silently dropped by emitting a clear log entry referencing TryAcquire, lockErr, dbPath, and the index controller (ic) context.
📒 Files selected for processing (5)
cmd/index.go, cmd/seed.go, cmd/seed_test.go, cmd/stdio.go, cmd/stdio_seedrace_test.go
@aeneasr Do you want me to address the refactoring coderabbitai pointed out? Seems too broad a change for this PR.
Working in a fresh git worktree of an already-indexed repo re-indexed everything from scratch instead of reusing the parent's embeddings. While fixing that, a cross-process race in the seeding path also surfaced. This PR addresses both.
Fixes #169
1. lumen index never copied a sibling worktree's index
Seeding an existing sibling-worktree index into a fresh worktree only happened in the MCP server's getOrCreate. But lumen index created the DB and re-indexed every file from scratch, never consulting FindDonorIndex/SeedFromDonor. Since that background process almost always wins the race to create the DB, getOrCreate's own seeding was then skipped (the DB already existed), so a worktree whose parent was fully indexed still paid the entire embedding cost again.
Fix: runIndexer now seeds from a donor when the DB doesn't yet exist — under the index lock it already holds, so only one indexer seeds — then lets the normal incremental EnsureFresh re-index only the changed files.
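The seed-before-index step can be illustrated with a minimal sketch. This is not the PR's code: `seedIfMissing` and `demoSeed` are hypothetical names, the "DB" is a plain file, and the sketch assumes the caller already holds the index lock. It also copies through a uniquely named temp file before renaming, which is a choice made for this illustration and not necessarily what SeedFromDonor does.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// seedIfMissing copies donorPath to dbPath only when dbPath does not exist yet.
// Assumption: the caller already holds the index lock, so no other process
// is seeding the same dbPath at the same time.
func seedIfMissing(dbPath, donorPath string) (bool, error) {
	if _, err := os.Stat(dbPath); err == nil {
		return false, nil // DB already exists: leave it for incremental re-indexing
	} else if !errors.Is(err, os.ErrNotExist) {
		return false, err
	}
	if donorPath == "" {
		return false, nil // no donor found: the caller indexes from scratch
	}

	src, err := os.Open(donorPath)
	if err != nil {
		return false, err
	}
	defer src.Close()

	// Write to a temp file in the same directory so the final rename
	// stays on one filesystem and publishes the copy in a single step.
	tmp, err := os.CreateTemp(filepath.Dir(dbPath), filepath.Base(dbPath)+".seed-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename has moved the file

	if _, err := io.Copy(tmp, src); err != nil {
		tmp.Close()
		return false, err
	}
	if err := tmp.Close(); err != nil {
		return false, err
	}
	if err := os.Rename(tmp.Name(), dbPath); err != nil {
		return false, err
	}
	return true, nil
}

// demoSeed seeds a fresh "worktree" DB twice and reports whether each call copied.
func demoSeed() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "seed-demo-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	donor := filepath.Join(dir, "donor.db")
	if err := os.WriteFile(donor, []byte("embeddings"), 0o600); err != nil {
		return false, false, err
	}
	db := filepath.Join(dir, "worktree.db")

	if first, err = seedIfMissing(db, donor); err != nil {
		return false, false, err
	}
	second, err = seedIfMissing(db, donor)
	return first, second, err
}

func main() {
	first, second, err := demoSeed()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("seeded on first call:", first, "- on second call:", second)
}
```

The second call is a no-op because the DB already exists, which mirrors the "seed only when the DB doesn't yet exist" condition described above.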
2. MCP seeding raced the background indexer
getOrCreate seeded and created a new index DB without holding the index flock that lumen index acquires. So the MCP server and the SessionStart-spawned background indexer could create the same fresh worktree DB concurrently: both run SeedFromDonor through a shared temp path and rename it over the DB (corrupting the SQLite file), or one creates an empty DB that makes the other skip seeding and re-index from scratch. The window is narrow — only the first index of a worktree — but the SessionStart directive that tells the agent to search first makes it reachable, and fix #1 (making lumen index seed too) widened the seed-vs-seed collision.
Fix: getOrCreate now acquires the same flock before seeding. When it wins, it seeds under the lock; when a peer holds it, it waits briefly (bounded by createWaitTimeout) for the peer to publish the DB so NewIndexer opens the seeded copy instead of creating an empty one that clobbers the seed.
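The "wait briefly for the peer to publish the DB" half of this fix amounts to a bounded poll. The sketch below is an illustration under assumptions, not the PR's waitForDB: `waitForFile` and `demoWait` are hypothetical names, the timeout and poll interval are made-up values, and the lock itself is left out. A goroutine stands in for the peer indexer.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForFile polls until path exists or timeout elapses.
// It reports whether the file appeared in time.
func waitForFile(path string, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if _, err := os.Stat(path); err == nil {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

// demoWait simulates a peer that publishes the DB shortly after we start
// waiting, then waits for a file that is never published.
func demoWait() (published, timedOut bool, err error) {
	dir, err := os.MkdirTemp("", "wait-demo-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	db := filepath.Join(dir, "index.db")
	peerDone := make(chan error, 1)
	go func() {
		// Stand-in for the background indexer seeding under the lock.
		time.Sleep(50 * time.Millisecond)
		peerDone <- os.WriteFile(db, []byte("seeded"), 0o600)
	}()

	published = waitForFile(db, 2*time.Second, 10*time.Millisecond)
	if err := <-peerDone; err != nil {
		return false, false, err
	}

	// No peer publishes this path, so the wait gives up after the timeout.
	timedOut = !waitForFile(filepath.Join(dir, "never.db"), 100*time.Millisecond, 10*time.Millisecond)
	return published, timedOut, nil
}

func main() {
	published, timedOut, err := demoWait()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("peer DB appeared:", published, "- missing DB timed out:", timedOut)
}
```

Bounding the wait means a crashed or stuck peer cannot block the caller indefinitely; after the timeout the caller falls back to creating the DB itself.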
Testing