feat: auto-detect and persist project conventions to wiki #1010

Open
kevincodex1 wants to merge 1 commit into Gitlawb:main from kevincodex1:feat/project-convention-scanner

@kevincodex1
Contributor

Summary

Adds a convention scanner that reads project config files (package.json, tsconfig.json, eslint, prettier, Dockerfile, CI workflows, lockfiles) on startup and saves extracted conventions to .openclaude/wiki/pages/conventions.md. Includes a fingerprint cache to avoid redundant writes and a /wiki scan command for manual re-scans.

New modules:

  • src/services/wiki/conventions.ts — scanner + cache + save
  • src/services/wiki/identity.ts — project identity (name, languages, monorepo)
  • src/services/wiki/conventions.test.ts — 7 tests

Modified:

  • paths/types/init/status — extended wiki infrastructure for conventions
  • wiki.tsx/index — added /wiki scan command
  • main.tsx — fires scan via startDeferredPrefetches
  • init.test.ts/status.test.ts — updated for new conventions page/fields

Impact

  • user-facing impact: After running /wiki init in a project, the agent now automatically knows the project's package manager, test framework, build/lint commands, TypeScript config, lint/format rules, and CI
    setup — no need to tell it every session. A new /wiki scan command lets users force a re-scan at any time. The /wiki status output now shows convention page status.
  • developer/maintainer impact: 3 new files (conventions.ts, identity.ts, conventions.test.ts), 7 modified. The scanner is extensible — adding a new config file type means adding one entry to the SCANNERS array.
    Fingerprint-based caching prevents redundant writes on subsequent sessions.
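
The extensibility claim above could look roughly like the sketch below. The PR text does not show the real shape of a `SCANNERS` entry, so the `ConventionScanner` interface, its `file`, `section`, and `extract` fields, the `scanProject` helper, and the `.nvmrc` entry are all hypothetical and only illustrate the "one entry per config type" idea.

```typescript
import { existsSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical entry shape; the actual SCANNERS type is not shown in the PR.
interface ConventionScanner {
  file: string; // config file to look for at the project root
  section: string; // section title used in the conventions page
  extract: (content: string) => string[]; // file content -> bullet lines
}

const SCANNERS: ConventionScanner[] = [
  {
    file: "package.json",
    section: "Scripts",
    extract: (content) => {
      const pkg = JSON.parse(content) as { scripts?: Record<string, string> };
      return Object.entries(pkg.scripts ?? {}).map(([name, cmd]) => `${name}: ${cmd}`);
    },
  },
  // Supporting another config type means adding one more entry (illustrative).
  {
    file: ".nvmrc",
    section: "Node version",
    extract: (content) => [content.trim()],
  },
];

// Reads top-level files only, matching the limitation listed under Notes.
function scanProject(root: string): Record<string, string[]> {
  const sections: Record<string, string[]> = {};
  for (const scanner of SCANNERS) {
    const path = join(root, scanner.file);
    if (!existsSync(path)) continue; // missing config files are skipped
    sections[scanner.section] = scanner.extract(readFileSync(path, "utf8"));
  }
  return sections;
}
```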

Testing

  • bun run build — compiles cleanly
  • bun run smoke — verified (build + version check passes)
  • focused tests: bun test src/services/wiki/ — 11 pass, 0 fail (53 expect() calls)
  • Typecheck: bun run typecheck — zero errors in services/wiki or commands/wiki modules

Notes

  • provider/model path tested: N/A (no provider/model changes)
  • screenshots attached (if UI changed): N/A (CLI-only, no UI changes)
  • follow-up work or known limitations:
    • Scanner runs after REPL render (via startDeferredPrefetches), so conventions aren't available on the very first turn of a fresh session. Subsequent sessions reuse the cached fingerprint.
    • Currently reads top-level config files only — doesn't recurse into monorepo packages (future work)
    • Wiki must be initialized with /wiki init before scanning starts (no auto-init)
    • Language detection is limited to the project root directory (no recursive file counting)

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>
@kevincodex1
Contributor Author

please have an independent review too @techbrewboss

@jatmn
Collaborator

jatmn commented May 4, 2026

I found 3 issues in the wiki-conventions changes:

1. Startup scan bypasses the trust gate

  • src/main.tsx
  • src/services/wiki/identity.ts

The new startup convention scan calls Git before trust is established by running git symbolic-ref refs/remotes/origin/HEAD through getProjectIdentity(). That breaks the existing startup rule that Git must not run in untrusted repos before trust is accepted.

Suggested fix: only run the scan after trust is established, or avoid Git-dependent identity detection during startup.
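
A minimal sketch of the first option (run the scan only after trust is established). The real gate is `prefetchSystemContextIfSafe()`, whose internals are not shown in this thread, so `isWorkspaceTrusted`, the `StartupDeps` interface, and `startConventionScanIfTrusted` are hypothetical names used to keep the example self-contained.

```typescript
// Hypothetical dependencies, injected so the sketch does not guess at real imports.
interface StartupDeps {
  isWorkspaceTrusted: () => boolean; // assumed trust check, not the project's actual API
  scanAndSaveConventions: (cwd: string) => Promise<unknown>;
  getCwd: () => string;
}

// Starts the convention scan only when trust has already been accepted, so no
// Git command is spawned in an untrusted repository during startup.
// Returns true when the scan was started.
function startConventionScanIfTrusted(deps: StartupDeps): boolean {
  if (!deps.isWorkspaceTrusted()) return false;
  // Fire-and-forget: a failed scan should not break startup.
  void deps.scanAndSaveConventions(deps.getCwd()).catch(() => {});
  return true;
}
```

With this shape, `startDeferredPrefetches()` would call the gated helper instead of calling `scanAndSaveConventions(getCwd())` directly.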

2. Cache key can leave conventions.md stale

  • src/services/wiki/conventions.ts

The cache fingerprint only includes detected convention sections, but the rendered page also includes project identity fields like project name, languages, monorepo status, and default branch. If only those identity fields change, the scan is skipped and the wiki page stays stale.

Suggested fix: include identity inputs in the fingerprint, or hash the final rendered markdown.
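
One way the first option (identity inputs in the fingerprint) might be sketched. The `ProjectIdentity` fields mirror the ones named above, but the exact types, the `canonical` helper, and the `computeFingerprintWithIdentity` name are assumptions; the existing `computeFingerprint()` signature is not shown in this thread.

```typescript
import { createHash } from "node:crypto";

// Assumed shape, based only on the identity fields listed in the review.
interface ProjectIdentity {
  name: string;
  languages: Record<string, number>;
  isMonorepo: boolean;
  defaultBranch: string | null;
}

// Recursively sorts object keys so the hash does not depend on key order.
function canonical(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonical);
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(obj).sort()) sorted[key] = canonical(obj[key]);
    return sorted;
  }
  return value;
}

// Hashes convention sections together with identity, so an identity-only
// change (for example a renamed default branch) invalidates the cache.
function computeFingerprintWithIdentity(
  sections: Record<string, string[]>,
  identity: ProjectIdentity,
): string {
  const payload = JSON.stringify(canonical({ sections, identity }));
  return createHash("sha256").update(payload).digest("hex");
}
```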

3. /wiki scan can still fail before /wiki init

  • src/services/wiki/conventions.ts

forceScanConventions() ignores failure writing pages/conventions.md, but still unconditionally writes .openclaude/.conventions-cache.json. In a fresh project that throws ENOENT, so /wiki scan can crash before the wiki exists.

Suggested fix: skip the cache write when the wiki root is missing, create the directory first, or return a clear "run /wiki init first" message.
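
A sketch of the "clear message" option, assuming the page and cache paths quoted in this thread. The `ScanResult` type, the `forceScanSafely` name, the message wording, and the cache file's JSON layout are hypothetical; the point is only that both writes sit behind the same existence check.

```typescript
import { existsSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical result type so the command can render a message instead of throwing.
type ScanResult =
  | { ok: true; fingerprint: string }
  | { ok: false; message: string };

function forceScanSafely(cwd: string, markdown: string, fingerprint: string): ScanResult {
  const pagesDir = join(cwd, ".openclaude", "wiki", "pages");
  // Guard both writes: without the wiki root, neither the page nor the cache is written.
  if (!existsSync(pagesDir)) {
    return { ok: false, message: "Wiki is not initialized. Run /wiki init first." };
  }
  writeFileSync(join(pagesDir, "conventions.md"), markdown, "utf8");
  writeFileSync(
    join(cwd, ".openclaude", ".conventions-cache.json"),
    JSON.stringify({ fingerprint }), // assumed cache layout
    "utf8",
  );
  return { ok: true, fingerprint };
}
```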

Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Thanks for the PR. The direction is useful, and CI is green, but I found one blocker on the current head.

Review scope: Targeted review of the new wiki convention scanner command/startup path, cache behavior, and focused tests.

Verdict: Needs changes

Blocking issue:

  1. /wiki scan fails when the wiki has not been initialized yet. forceScanConventions() catches the missing pages/conventions.md write, but then still calls writeCache(), which tries to write .openclaude/.conventions-cache.json even when .openclaude does not exist. I reproduced this locally with a temp project that only had package.json; forceScanConventions(cwd) throws ENOENT: no such file or directory, open '<tmp>\.openclaude\.conventions-cache.json'.

Why this matters: /wiki scan is now a user-facing command, so the uninitialized path should either return a clear "run /wiki init first" message or avoid writing the cache when the wiki root is missing. Right now it can crash instead of producing a clean command result.

What I checked:

  • Current head: 51d6caf
  • bun test src/services/wiki/ passes
  • bun run build passes
  • Manual repro for uninitialized forceScanConventions(cwd) fails with the cache write ENOENT above

Happy to re-review once that uninitialized scan path is handled and covered by a small regression test.

Collaborator

@techbrewboss techbrewboss left a comment


Thanks for the PR. I did an independent pass over the convention scanner startup path, cache behavior, and the wiki command flow. I agree with the existing uninitialized /wiki scan blocker, and I found two additional issues that should be addressed before merge.

  1. Startup scan still runs Git before workspace trust is established. startDeferredPrefetches() now calls scanAndSaveConventions(getCwd()) unconditionally, and that reaches getProjectIdentity() -> detectMainBranch() -> execFileSync('git', ['symbolic-ref', ...]). This bypasses the explicit prefetchSystemContextIfSafe() trust gate immediately above it, which exists because Git can execute repo-controlled hooks/config. Please either gate this startup scan behind the same trust/non-interactive check, or make the startup path avoid Git-dependent identity detection until after trust.

  2. The cache fingerprint can leave conventions.md stale. computeFingerprint() only hashes the detected convention sections, but the rendered markdown also includes identity-derived data: project name, language counts, monorepo status, and default branch. If any of those identity fields changes while the scanned config sections do not, scanAndSaveConventions() returns null and skips rewriting the page. Please include the identity inputs in the cache key, or compute the cache key from the final markdown after normalizing/removing the scan timestamp.
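
For the second option in item 2 (keying the cache on the final markdown), a minimal sketch follows. The format of the scan timestamp in `conventions.md` is not shown in this thread, so the `Last scanned:` line, the `TIMESTAMP_LINE` pattern, and the `fingerprintRenderedPage` name are assumptions.

```typescript
import { createHash } from "node:crypto";

// Assumption: the rendered page carries a single line like "Last scanned: <time>".
const TIMESTAMP_LINE = /^Last scanned: .*$/m;

// Hashes the rendered page with the timestamp line normalized, so anything
// visible on the page (including identity fields) affects the cache key,
// while a new scan time alone does not.
function fingerprintRenderedPage(markdown: string): string {
  const normalized = markdown.replace(TIMESTAMP_LINE, "Last scanned: <normalized>");
  return createHash("sha256").update(normalized).digest("hex");
}
```

The trade-off is that the page has to be rendered before the cache check, in exchange for not having to keep a separate list of cache inputs in sync with the template.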

I also confirmed bun test src/services/wiki/conventions.test.ts passes on this head, but the current tests do not cover the trust-gated startup path, identity-only cache invalidation, or the uninitialized /wiki scan cache write failure.

Collaborator

@gnanam1990 gnanam1990 left a comment


Re-reviewed at current head — the three blockers raised by Vasanthdev2004 and techbrewboss are still present:

  1. forceScanConventions(cwd) (src/services/wiki/conventions.ts) catches the page write failure but unconditionally calls writeCache(cwd, scan.fingerprint) immediately after, which writes .openclaude/.conventions-cache.json. If .openclaude doesn't exist, this still throws ENOENT — /wiki scan crashes for users who haven't run /wiki init.
  2. startDeferredPrefetches() in src/main.tsx (line ~422) calls scanAndSaveConventions(getCwd()) outside the existing prefetchSystemContextIfSafe() trust gate. That path reaches getProjectIdentity() -> detectMainBranch() -> execFileSync('git', …) in src/services/wiki/identity.ts:798, which can run repo-controlled git hooks/config on untrusted workspaces. Must be gated identically to the system-context prefetch above it.
  3. computeFingerprint() only hashes the scanned config sections — but the rendered markdown also embeds identity-derived data (project name, language counts, monorepo flag, default branch). Identity-only changes won't invalidate the cache, so conventions.md stays stale. Either include identity inputs in the cache key or hash the final rendered markdown (with timestamp normalized).

No red-flag rule hits otherwise. Verified locally: traced forceScanConventions and startDeferredPrefetches call paths in the diff at HEAD.


5 participants