feat(purge): rework index cleanup with cwd default and new flags by Ismael · Pull Request #172 · ory/lumen

Ismael · 2026-05-30T19:48:15Z

Reworks lumen purge to be safe-by-default and adds explicit modes for broader cleanup. Previously, lumen purge with no arguments wiped every index. Now no-args purges only the current project, and destructive/bulk operations are opt-in behind flags.

Invocation	Effect
lumen purge	Removes only the current working directory's index (path normalized to its git root).
lumen purge <path...>	Removes the index directories for the given projects (unchanged).
lumen purge --all	Removes every index, including legacy indexes that predate project_path metadata.
lumen purge --missing	Removes indexes whose recorded project folder no longer exists on disk. Deletes only when the folder is confirmed missing.
lumen purge --missing --dry-run	Previews what --missing would remove without deleting.

Mutually exclusive combinations are rejected with clear errors (e.g. --all + paths, --dry-run without --missing).

Notes

--missing is conservative: any stat error other than "not exist" skips the index rather than risking deletion, so a transient filesystem error never wipes a live index.
Each removed index is logged individually (with its project_path where known); --all also logs legacy dirs by path.
The data-dir scan (scanIndexes) is shared across all modes; it now also surfaces legacy (metadata-less) dirs so --all can log them, while path/--missing modes ignore them.
README and purge --help text updated to document all modes.

Closes #166

Summary by CodeRabbit

New Features
- Purge command adds --all, --missing, --legacy and --dry-run modes; default no-arg purge now targets only the current directory.
- Purge now reports which index directories will be removed before deletion.
Documentation
- README updated to document the new purge modes as safer alternatives to deleting the lumen directory.
Tests
- Expanded tests to cover flag validation, --missing/--legacy behaviors, and dry-run preview logging.

Change "lumen purge" with no arguments to remove only the current project's index (normalized to its git root) instead of wiping everything, and add explicit modes for the broader operations: --all Remove every index, including legacy indexes that predate project_path metadata. --missing Remove indexes whose recorded project folder no longer exists; only deletes when the folder is confirmed missing. --dry-run With --missing, preview removals without deleting. Each removed index is logged individually, mutually exclusive flag combinations are rejected, and the data-dir scan is shared across all modes. README and command help updated accordingly. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai · 2026-05-30T19:48:25Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9dbce66f-7335-42de-a3d4-2655ca45e89f

📥 Commits

Reviewing files that changed from the base of the PR and between 7d51809 and a924d20.

📒 Files selected for processing (2)

cmd/purge.go
cmd/purge_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

cmd/purge.go

📝 Walkthrough

Walkthrough

This PR expands the lumen purge command with three operational modes: --all purges every index, --missing removes indexes for deleted source folders, and --dry-run previews removals without deletion. The implementation includes flag definitions, validation, purge mode dispatch, refactored index scanning, comprehensive test coverage, and user documentation.

Changes

Purge Command Mode Expansion

Layer / File(s)	Summary
Flag definitions and command setup `cmd/purge.go`	Purge flag names are declared as constants, registered via `registerPurgeFlags`, and the purge command help text is extended to document `--all`, `--missing`, `--legacy`, and `--dry-run`.
Core purge infrastructure and scan refactor `cmd/purge.go`	`runPurge` is refactored to read/validate flags, default no-arg behavior to the current working directory, dispatch modes, and `scanIndexes` now classifies indexes into `indexMap`, `noMeta`, and `unreadable` buckets.
Purge mode implementations `cmd/purge.go`	`purgeAll` logs all hash dirs before removing the data dir; `purgeProjects` uses the new scan shape; `purgeMissing` deletes only index hash dirs whose recorded `project_path` no longer exists (with `--dry-run` preview); `purgeLegacy` removes legacy metadata-less and unreadable dirs.
Tests and test helpers `cmd/purge_test.go`	Adds seeding helpers for metadata-backed, legacy (no `project_path`), and corrupt indexes; updates `runPurgeCmd` to set flags by name; adds table-driven tests covering flag validation, `--all`, `--missing`, `--dry-run`, and `--legacy` behaviors and filesystem effects.
User documentation `README.md`	Updates the "Database location" section to document `lumen purge` modes (`--all`, `--missing`, `--dry-run`) as safer alternatives to deleting the lumen directory.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

ory/lumen#135: Both PRs modify cmd/purge.go to rely on project_path metadata when scanning and removing index hash directories, with this PR adding mode flags and legacy handling support.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR implements purge improvements but does not address the primary coding objective of issue `#166`: adding a list command to display indexed projects and models before purging.	Address the core requirement from `#166` by implementing a list command that shows indexed projects and models, or update the issue to reflect that the list feature will be implemented separately.
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: refactoring the purge command with a new default behavior (cwd-based) and introducing new flags (--all, --missing, --dry-run, --legacy).
Out of Scope Changes check	✅ Passed	All changes are directly related to improving the purge command's functionality and safety—updating README documentation, refactoring flag handling, adding new modes (--all, --missing, --legacy, --dry-run), and expanding test coverage.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

cmd/purge_test.go (1)

268-342: ⚡ Quick win

Refactor these related flag-behavior tests into a table-driven test.

These cases are a good candidate for one table-driven test (name, flags, args, wantErr, wantErrContains, wantStderrContains, postCheck) to reduce duplication and make future mode additions safer.

As per coding guidelines, "Use table-driven tests for multiple test cases in Go".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/purge_test.go` around lines 268 - 342, Several closely related tests
(TestPurge_Missing_RemovesOnlyDeletedFolders,
TestPurge_Missing_AllFoldersExist_RemovesNothing,
TestPurge_Missing_WithPaths_Errors, TestPurge_DryRun_Missing_DeletesNothing,
TestPurge_DryRun_WithoutMissing_Errors) exercise the same purge flag behaviors
and should be consolidated into a single table-driven test; replace these
individual tests with one TestPurge_FlagBehavior that defines test cases with
fields like name, flags ([]string), args ([]string), wantErr (bool),
wantErrContains (string), wantStderrContains (string), and postCheck
(func(t,*testing.T)), and for each case call resolvedTempDir, seedIndex,
runPurgeCmd and assert expectations (use flagMissing and flagDryRun constants,
runPurgeCmd, and verify filesystem state via os.Stat inside postCheck) to cover
the same assertions as the originals while removing duplicated setup/teardown
code.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/purge_test.go`:
- Around line 308-315: Add a new unit test (or extend existing tests) to assert
that using the mutually-exclusive flags `--all` and `--missing` produces an
error: create a test similar to TestPurge_Missing_WithPaths_Errors that calls
runPurgeCmd with both flagAll and flagMissing (e.g., []string{flagAll,
flagMissing} or the equivalent string flags) and require.Error on the result,
then assert the error message mentions the conflict (e.g., contains "--all" and
"--missing"); also add the same assertion for the other location noted (the
block around the Test at lines 335-342) to cover both cases.

In `@cmd/purge.go`:
- Around line 141-145: The call to scanIndexes in purge.go may return an error
that is currently ignored, causing --all to proceed to delete the whole dataDir
without performing the promised per-index logging; update the code around
scanIndexes(dataDir) to capture the returned error (e.g., indexMap, legacy, err
:= scanIndexes(dataDir)), check err immediately, and return or propagate that
error before performing any irreversible delete operations so the command fails
safely when the pre-delete scan fails.
- Around line 235-238: The current logic treats any non-empty error from
store.ReadMetaAt as a missing metadata case and appends hashDir to legacy;
change the condition so only the “found but empty” case (err == nil && stored ==
"") appends to legacy, and handle err != nil separately: when store.ReadMetaAt
returns an error, wrap/return that error (or increment/log a distinct failure
counter) instead of classifying it as legacy; update the block around
store.ReadMetaAt(...) in cmd/purge.go to check err first and use the explicit
stored=="" branch for legacy so real DB/read failures (from ReadMetaAt) are not
silently skipped.

---

Nitpick comments:
In `@cmd/purge_test.go`:
- Around line 268-342: Several closely related tests
(TestPurge_Missing_RemovesOnlyDeletedFolders,
TestPurge_Missing_AllFoldersExist_RemovesNothing,
TestPurge_Missing_WithPaths_Errors, TestPurge_DryRun_Missing_DeletesNothing,
TestPurge_DryRun_WithoutMissing_Errors) exercise the same purge flag behaviors
and should be consolidated into a single table-driven test; replace these
individual tests with one TestPurge_FlagBehavior that defines test cases with
fields like name, flags ([]string), args ([]string), wantErr (bool),
wantErrContains (string), wantStderrContains (string), and postCheck
(func(t,*testing.T)), and for each case call resolvedTempDir, seedIndex,
runPurgeCmd and assert expectations (use flagMissing and flagDryRun constants,
runPurgeCmd, and verify filesystem state via os.Stat inside postCheck) to cover
the same assertions as the originals while removing duplicated setup/teardown
code.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e66f5bca-1984-4341-b0e8-5cac3a01e823

📥 Commits

Reviewing files that changed from the base of the PR and between d0dee0e and 7d51809.

📒 Files selected for processing (3)

README.md
cmd/purge.go
cmd/purge_test.go

Add --legacy flag to remove only legacy indexes (created by older binaries without project_path metadata) plus unreadable/corrupt index directories. Legacy indexes remain usable by the system (located by path hash) but are invisible to path- and --missing-based purge.

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread cmd/purge_test.go Outdated

Comment thread cmd/purge.go

Comment thread cmd/purge.go Outdated

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(purge): rework index cleanup with cwd default and new flags#172

feat(purge): rework index cleanup with cwd default and new flags#172
Ismael wants to merge 2 commits into
ory:mainfrom
Ismael:feat-purge-rework

Ismael commented May 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 30, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Ismael commented May 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Ismael commented May 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 30, 2026 •

edited

Loading