Skip to content

feat(purge): rework index cleanup with cwd default and new flags#172

Open
Ismael wants to merge 2 commits into
ory:mainfrom
Ismael:feat-purge-rework
Open

feat(purge): rework index cleanup with cwd default and new flags#172
Ismael wants to merge 2 commits into
ory:mainfrom
Ismael:feat-purge-rework

Conversation

@Ismael
Copy link
Copy Markdown

@Ismael Ismael commented May 30, 2026

Reworks lumen purge to be safe-by-default and adds explicit modes for broader cleanup. Previously, lumen purge with no arguments wiped every index. Now no-args purges only the current project, and destructive/bulk operations are opt-in behind flags.

Invocation Effect
lumen purge Removes only the current working directory's index (path normalized to its git root).
lumen purge <path...> Removes the index directories for the given projects (unchanged).
lumen purge --all Removes every index, including legacy indexes that predate project_path metadata.
lumen purge --missing Removes indexes whose recorded project folder no longer exists on disk. Deletes only when the folder is confirmed missing.
lumen purge --missing --dry-run Previews what --missing would remove without deleting.
Mutually exclusive combinations are rejected with clear errors (e.g. --all + paths, --dry-run without --missing).

Notes

  • --missing is conservative: any stat error other than "not exist" skips the index rather than risking deletion, so a transient filesystem error never wipes a live index.
  • Each removed index is logged individually (with its project_path where known); --all also logs legacy dirs by path.
  • The data-dir scan (scanIndexes) is shared across all modes; it now also surfaces legacy (metadata-less) dirs so --all can log them, while path/--missing modes ignore them.
  • README and purge --help text updated to document all modes.

Closes #166

Summary by CodeRabbit

  • New Features

    • Purge command adds --all, --missing, --legacy and --dry-run modes; default no-arg purge now targets only the current directory.
    • Purge now reports which index directories will be removed before deletion.
  • Documentation

    • README updated to document the new purge modes as safer alternatives to deleting the lumen directory.
  • Tests

    • Expanded tests to cover flag validation, --missing/--legacy behaviors, and dry-run preview logging.

Change "lumen purge" with no arguments to remove only the current
project's index (normalized to its git root) instead of wiping
everything, and add explicit modes for the broader operations:

  --all       Remove every index, including legacy indexes that predate
              project_path metadata.
  --missing   Remove indexes whose recorded project folder no longer
              exists; only deletes when the folder is confirmed missing.
  --dry-run   With --missing, preview removals without deleting.

Each removed index is logged individually, mutually exclusive flag
combinations are rejected, and the data-dir scan is shared across all
modes. README and command help updated accordingly.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9dbce66f-7335-42de-a3d4-2655ca45e89f

📥 Commits

Reviewing files that changed from the base of the PR and between 7d51809 and a924d20.

📒 Files selected for processing (2)
  • cmd/purge.go
  • cmd/purge_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/purge.go

📝 Walkthrough

Walkthrough

This PR expands the lumen purge command with three operational modes: --all purges every index, --missing removes indexes for deleted source folders, and --dry-run previews removals without deletion. The implementation includes flag definitions, validation, purge mode dispatch, refactored index scanning, comprehensive test coverage, and user documentation.

Changes

Purge Command Mode Expansion

Layer / File(s) Summary
Flag definitions and command setup
cmd/purge.go
Purge flag names are declared as constants, registered via registerPurgeFlags, and the purge command help text is extended to document --all, --missing, --legacy, and --dry-run.
Core purge infrastructure and scan refactor
cmd/purge.go
runPurge is refactored to read/validate flags, default no-arg behavior to the current working directory, dispatch modes, and scanIndexes now classifies indexes into indexMap, noMeta, and unreadable buckets.
Purge mode implementations
cmd/purge.go
purgeAll logs all hash dirs before removing the data dir; purgeProjects uses the new scan shape; purgeMissing deletes only index hash dirs whose recorded project_path no longer exists (with --dry-run preview); purgeLegacy removes legacy metadata-less and unreadable dirs.
Tests and test helpers
cmd/purge_test.go
Adds seeding helpers for metadata-backed, legacy (no project_path), and corrupt indexes; updates runPurgeCmd to set flags by name; adds table-driven tests covering flag validation, --all, --missing, --dry-run, and --legacy behaviors and filesystem effects.
User documentation
README.md
Updates the "Database location" section to document lumen purge modes (--all, --missing, --dry-run) as safer alternatives to deleting the lumen directory.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • ory/lumen#135: Both PRs modify cmd/purge.go to rely on project_path metadata when scanning and removing index hash directories, with this PR adding mode flags and legacy handling support.
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR implements purge improvements but does not address the primary coding objective of issue #166: adding a list command to display indexed projects and models before purging. Address the core requirement from #166 by implementing a list command that shows indexed projects and models, or update the issue to reflect that the list feature will be implemented separately.
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: refactoring the purge command with a new default behavior (cwd-based) and introducing new flags (--all, --missing, --dry-run, --legacy).
Out of Scope Changes check ✅ Passed All changes are directly related to improving the purge command's functionality and safety—updating README documentation, refactoring flag handling, adding new modes (--all, --missing, --legacy, --dry-run), and expanding test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
cmd/purge_test.go (1)

268-342: ⚡ Quick win

Refactor these related flag-behavior tests into a table-driven test.

These cases are a good candidate for one table-driven test (name, flags, args, wantErr, wantErrContains, wantStderrContains, postCheck) to reduce duplication and make future mode additions safer.

As per coding guidelines, "Use table-driven tests for multiple test cases in Go".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/purge_test.go` around lines 268 - 342, Several closely related tests
(TestPurge_Missing_RemovesOnlyDeletedFolders,
TestPurge_Missing_AllFoldersExist_RemovesNothing,
TestPurge_Missing_WithPaths_Errors, TestPurge_DryRun_Missing_DeletesNothing,
TestPurge_DryRun_WithoutMissing_Errors) exercise the same purge flag behaviors
and should be consolidated into a single table-driven test; replace these
individual tests with one TestPurge_FlagBehavior that defines test cases with
fields like name, flags ([]string), args ([]string), wantErr (bool),
wantErrContains (string), wantStderrContains (string), and postCheck
(func(t,*testing.T)), and for each case call resolvedTempDir, seedIndex,
runPurgeCmd and assert expectations (use flagMissing and flagDryRun constants,
runPurgeCmd, and verify filesystem state via os.Stat inside postCheck) to cover
the same assertions as the originals while removing duplicated setup/teardown
code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/purge_test.go`:
- Around line 308-315: Add a new unit test (or extend existing tests) to assert
that using the mutually-exclusive flags `--all` and `--missing` produces an
error: create a test similar to TestPurge_Missing_WithPaths_Errors that calls
runPurgeCmd with both flagAll and flagMissing (e.g., []string{flagAll,
flagMissing} or the equivalent string flags) and require.Error on the result,
then assert the error message mentions the conflict (e.g., contains "--all" and
"--missing"); also add the same assertion for the other location noted (the
block around the Test at lines 335-342) to cover both cases.

In `@cmd/purge.go`:
- Around line 141-145: The call to scanIndexes in purge.go may return an error
that is currently ignored, causing --all to proceed to delete the whole dataDir
without performing the promised per-index logging; update the code around
scanIndexes(dataDir) to capture the returned error (e.g., indexMap, legacy, err
:= scanIndexes(dataDir)), check err immediately, and return or propagate that
error before performing any irreversible delete operations so the command fails
safely when the pre-delete scan fails.
- Around line 235-238: The current logic treats any non-empty error from
store.ReadMetaAt as a missing metadata case and appends hashDir to legacy;
change the condition so only the “found but empty” case (err == nil && stored ==
"") appends to legacy, and handle err != nil separately: when store.ReadMetaAt
returns an error, wrap/return that error (or increment/log a distinct failure
counter) instead of classifying it as legacy; update the block around
store.ReadMetaAt(...) in cmd/purge.go to check err first and use the explicit
stored=="" branch for legacy so real DB/read failures (from ReadMetaAt) are not
silently skipped.

---

Nitpick comments:
In `@cmd/purge_test.go`:
- Around line 268-342: Several closely related tests
(TestPurge_Missing_RemovesOnlyDeletedFolders,
TestPurge_Missing_AllFoldersExist_RemovesNothing,
TestPurge_Missing_WithPaths_Errors, TestPurge_DryRun_Missing_DeletesNothing,
TestPurge_DryRun_WithoutMissing_Errors) exercise the same purge flag behaviors
and should be consolidated into a single table-driven test; replace these
individual tests with one TestPurge_FlagBehavior that defines test cases with
fields like name, flags ([]string), args ([]string), wantErr (bool),
wantErrContains (string), wantStderrContains (string), and postCheck
(func(t,*testing.T)), and for each case call resolvedTempDir, seedIndex,
runPurgeCmd and assert expectations (use flagMissing and flagDryRun constants,
runPurgeCmd, and verify filesystem state via os.Stat inside postCheck) to cover
the same assertions as the originals while removing duplicated setup/teardown
code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e66f5bca-1984-4341-b0e8-5cac3a01e823

📥 Commits

Reviewing files that changed from the base of the PR and between d0dee0e and 7d51809.

📒 Files selected for processing (3)
  • README.md
  • cmd/purge.go
  • cmd/purge_test.go

Comment thread cmd/purge_test.go Outdated
Comment thread cmd/purge.go
Comment thread cmd/purge.go Outdated
Add --legacy flag to remove only legacy indexes (created by older binaries without project_path metadata) plus unreadable/corrupt index directories.
Legacy indexes remain usable by the system (located by path hash) but are invisible to path- and --missing-based purge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a list command so we can know what to purge

1 participant