feat: add Codebase Intelligence — repo map with PageRank-ranked structural summaries#543
gnanam1990 wants to merge 3 commits into main from
Vasanthdev2004
left a comment
Thanks, this is a genuinely interesting addition, and I like that you kept the auto-injected path behind a flag while also adding a direct /repomap command and tool surface. The tests and green CI help here too.
I do still see one blocker on the current head though:
- The PR documents and implies that repo-map auto-injection can be enabled with REPO_MAP=1 openclaude, but the actual gating in the open build is still compile-time via feature('REPO_MAP') from bun:bundle, and scripts/build.ts still hardcodes REPO_MAP: false. On the current head, setting the runtime env var alone will not make getRepoMapContext() start injecting anything into session context. So right now the user-facing docs and the shipped behavior disagree.
Concretely, the current surface looks like this:
- src/context.ts only enables auto-injection when feature('REPO_MAP') is true
- scripts/build.ts still sets REPO_MAP: false
- docs/repo-map.md tells users to enable it with a runtime env var
I think this needs one of two fixes before approval:
- either wire the feature so the documented runtime enablement actually works in the open build, or
- narrow the docs and PR messaging so they clearly say only /repomap and the tool are available for now, and that auto-injection is not user-enableable in the current open build yet.
Once that mismatch is fixed on the current head, I’m happy to re-review.
|
@Vasanthdev2004 Good catch — you're right, the docs and the actual gate disagreed. Fixed in 5919dde.
const runtimeEnabled = isEnvTruthy(process.env.REPO_MAP)
if (!feature('REPO_MAP') && !runtimeEnabled) return null
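The two-line gate above can be exercised in isolation. What follows is a minimal sketch, not the code from the commit: `repoMapEnabled` is a hypothetical wrapper, its `compileTimeFlag` parameter stands in for `feature('REPO_MAP')` from `bun:bundle`, and the set of values this `isEnvTruthy` accepts is an assumption (the real helper may differ).

```typescript
// Assumed truthy spellings; the project's own isEnvTruthy may accept a different set.
const TRUTHY_VALUES = ['1', 'true', 'yes', 'on']

// Hypothetical stand-in for the codebase's isEnvTruthy helper.
function isEnvTruthy(value: string | undefined): boolean {
  if (value === undefined) return false
  return TRUTHY_VALUES.indexOf(value.trim().toLowerCase()) !== -1
}

// compileTimeFlag stands in for feature('REPO_MAP'), which is fixed at build time.
// The env map is passed in so the gate can be tested without touching process.env.
function repoMapEnabled(
  compileTimeFlag: boolean,
  env: Record<string, string | undefined>,
): boolean {
  return compileTimeFlag || isEnvTruthy(env.REPO_MAP)
}
```

With the compile-time flag off, as in the open build, the runtime variable alone decides the outcome, which is the behavior the docs describe.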
|
5919dde to 43886cc
Vasanthdev2004
left a comment
Thanks for the follow-up here. I rechecked the current head 43886ccbf8ab1605ea8d948e5218bd7c5af386e9 against the actual GitHub PR surface, the latest commits, the earlier review thread, and the current check state.
This is a targeted re-review of the earlier blocker around the repo-map enablement path.
What I rechecked:
- src/context.ts now enables repo-map auto-injection when either the compile-time feature('REPO_MAP') flag is on or the runtime REPO_MAP env var is truthy
- scripts/build.ts still keeps the compile-time default off (REPO_MAP: false)
- docs/repo-map.md now matches the shipped open-build behavior for runtime enablement
- current checks are green on this head
That fixes the blocker I raised earlier. The documented REPO_MAP=1 openclaude path now actually matches the gate in the open build, instead of being a no-op.
Verdict: Approve-ready
I do not see a remaining blocker on the current head.
|
hello bro @gnanam1990 please fix conflicts when you can so we can merge this |
Vasanthdev2004
left a comment
Thanks for the follow-up. I rechecked the current head cf32497730ace65028d897e91ef4638a3f582306 against the earlier review state, the commits added since the earlier repo-map re-review, the current PR surface, and the current check status.
This is a targeted re-review of the stale head, not a fresh full review of the entire PR.
Verdict: Needs changes
Blocking issue:
- src/tools/WebSearchTool/providers/duckduckgo.ts: since the earlier repo-map re-review, the branch has picked up a new commit that changes DuckDuckGo web-search error handling. That is outside the stated scope of this PR, which is still framed as the repo-map feature, and it touches a higher-scrutiny network/tool-behavior surface. I do not want to re-approve the current head under the earlier repo-map-only approval context while that unrelated change is bundled in here.
Non-blocking notes:
- The earlier repo-map blocker still appears fixed on the current head.
- Current GitHub checks are green.
- If the DDG change is split out or dropped from this branch, I would expect the repo-map PR to be back in approve-ready shape from my side.
…tural summaries
Add a new module that builds a structural map of the repository by parsing source files with tree-sitter, building a cross-file reference graph weighted by IDF, ranking files with PageRank, and rendering a token-budgeted summary of the most important files and their signatures.
Stage 1 (core module, src/context/repoMap/): Symbol extraction via web-tree-sitter WASM, IDF-weighted reference graph via graphology, PageRank ranking, token-budgeted rendering via js-tiktoken cl100k_base, disk cache with mtime invalidation. Supports TypeScript, JavaScript, and Python. 10 tests.
Stage 2 (RepoMap tool, src/tools/RepoMapTool/): buildTool wrapper registered in src/tools.ts. Read-only, concurrency-safe. Supports focus_files, focus_symbols, and max_tokens parameters. 9 tests.
Stage 3 (integration): Auto-injection into session context behind REPO_MAP feature flag (off by default). /repomap slash command with --tokens, --focus, --stats, and --invalidate flags. User-facing docs in docs/repo-map.md. 13 tests. With the flag off, the system context is byte-identical to previous behavior.
Dependencies: web-tree-sitter, tree-sitter-wasms, graphology, graphology-pagerank, graphology-operators, js-tiktoken
Tests: 32 new, 621 total passing, 0 failures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
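The ranking step in this commit message (IDF-weighted reference graph, then PageRank) can be illustrated with a small hand-rolled sketch. It is not the PR's implementation, which uses graphology and graphology-pagerank. The `Reference` shape, the `rankFiles` name, the `log(1 + N/df)` weighting, and the damping and iteration defaults are all assumptions chosen for illustration.

```typescript
// One cross-file reference: file `from` uses `symbol`, which is defined in file `to`.
type Reference = { from: string; to: string; symbol: string }

// Plain power-iteration PageRank over IDF-weighted edges (illustrative only).
function rankFiles(
  files: string[],
  refs: Reference[],
  damping = 0.85,
  iterations = 50,
): Record<string, number> {
  const n = files.length
  const known: Record<string, boolean> = Object.create(null)
  for (const f of files) known[f] = true

  // Document frequency: number of distinct files that reference each symbol.
  const counted: Record<string, boolean> = Object.create(null)
  const df: Record<string, number> = Object.create(null)
  for (const r of refs) {
    const key = r.symbol + '\u0000' + r.from
    if (!counted[key]) {
      counted[key] = true
      df[r.symbol] = (df[r.symbol] || 0) + 1
    }
  }

  // Edge weight: symbols referenced from many files (get, set, ...) contribute less.
  const out: Record<string, Record<string, number>> = Object.create(null)
  for (const r of refs) {
    if (r.from === r.to || !known[r.from] || !known[r.to]) continue
    const weight = Math.log(1 + n / df[r.symbol])
    const edges = out[r.from] || (out[r.from] = Object.create(null))
    edges[r.to] = (edges[r.to] || 0) + weight
  }

  let rank: Record<string, number> = Object.create(null)
  for (const f of files) rank[f] = 1 / n
  for (let i = 0; i < iterations; i++) {
    const next: Record<string, number> = Object.create(null)
    for (const f of files) next[f] = (1 - damping) / n
    let dangling = 0
    for (const f of files) {
      const edges = out[f]
      if (!edges) {
        dangling += rank[f] // files with no outgoing references
        continue
      }
      const targets = Object.keys(edges)
      let total = 0
      for (const t of targets) total += edges[t]
      for (const t of targets) next[t] += damping * rank[f] * (edges[t] / total)
    }
    // Spread the rank of dangling files evenly so the scores keep summing to 1.
    for (const f of files) next[f] += (damping * dangling) / n
    rank = next
  }
  return rank
}
```

A file that many others import accumulates rank from each importer, so it ends up near the top of the map, while references through very common symbol names move less rank per edge.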
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses review feedback from @Vasanthdev2004: the docs advertised REPO_MAP=1 openclaude as the enablement path, but the gate in getRepoMapContext only checked feature('REPO_MAP'), which is compile-time and hardcoded to false in the open build. The env var was effectively a no-op.
Now getRepoMapContext enables auto-injection when EITHER the compile-time flag is true OR the runtime env var REPO_MAP is truthy. This makes the documented enablement path actually work without requiring users to edit scripts/build.ts and rebuild.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cf32497 to 8c8ec7c
|
@Vasanthdev2004 ready for re-review. Addressed your April 24 feedback by rebasing the branch onto current main:
Verified locally on Linux:
The runtime |
|
@Vasanthdev2004 Thanks for the re-review. Quick status update on the stale-head concern: The DDG change is no longer in this PR's diff. Current head is
CI is green. Could you take another look when you have a moment? |
Vasanthdev2004
left a comment
Thanks for the rebase and cleanup. I rechecked current head 8c8ec7c.
Scope: Targeted re-review of the latest head after the stale DuckDuckGo concern.
Verdict: Needs changes
Good news: the earlier stale-head blocker is cleared. The current PR file list is repo-map-only, I do not see any WebSearchTool / DuckDuckGo paths in the diff, and GitHub checks are green.
Blocking issue:
- The repo-map feature appears to work from the source checkout, but it will not work correctly from the published npm package because the runtime tree-sitter query files are not shipped.
src/context/repoMap/parser.ts reads src/context/repoMap/queries/*-tags.scm at runtime, but package.json only publishes bin/, dist/cli.mjs, and README.md. I verified this with npm pack --dry-run; the tarball contents are only LICENSE, README.md, bin/import-specifier.mjs, bin/import-specifier.test.mjs, bin/openclaude, dist/cli.mjs, and package.json. So after npm install -g @gitlawb/openclaude, /repomap and auto-injection would not have the query files available and symbol extraction would silently return empty results.
What I checked:
- gh pr view 543 current head: 8c8ec7c
- gh pr diff 543 --name-only: no WebSearchTool / DuckDuckGo files
- bun install --frozen-lockfile
- bun test ./src/tools/RepoMapTool ./src/context/repoMap ./src/commands/repomap ./src/context.repoMap.test.ts passed 32/32
- bun run build passed
- npm pack --dry-run confirmed the query assets are not included in the package
Suggested fix: either embed the .scm query text into the bundle, or ship the query files in the package and resolve them from a packaged path. Please also add a small packaging test/check so this does not regress.
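One possible shape for the packaging check suggested above is a pure function that compares the tarball file list against the assets the runtime needs. This is a sketch under stated assumptions: `missingAssets` and the `PackReport` type are hypothetical names, the required paths in the usage note are made up for illustration, and the input is assumed to be the parsed array that `npm pack --dry-run --json` prints, where each entry carries a `files` list with a `path` field.

```typescript
// Assumed shape of one entry in the JSON array printed by `npm pack --dry-run --json`.
type PackReport = { files: { path: string }[] }

// Returns every required path that is absent from the packed file list.
function missingAssets(report: PackReport[], required: string[]): string[] {
  const shipped: Record<string, boolean> = Object.create(null)
  for (const entry of report) {
    for (const file of entry.files) shipped[file.path] = true
  }
  return required.filter(path => !shipped[path])
}
```

A CI step could parse the npm output, call `missingAssets` with whatever packaged location the query files end up in (for example a hypothetical `dist/queries/typescript-tags.scm`), and fail the build when the result is non-empty. If the queries are embedded in the bundle instead, the check would shift to asserting that symbol extraction returns non-empty results when run from the packed artifact.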
Non-blocking note:
- The /repomap docs say the default is 1024 tokens, but parseArgs() defaults to 2048 and the test name says 1024 while asserting 2048. Worth aligning while touching this area, but the packaging issue above is the blocker.
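One way to keep the documented and actual default from drifting is to route both through a single constant. The sketch below is hypothetical: the real `parseArgs()` may parse flags differently (for example `--tokens=N`), `RepoMapArgs` and `DEFAULT_MAX_TOKENS` are illustrative names, and 2048 is used only because that is what the review reports the code currently does, not because it is known to be the intended value.

```typescript
// Hypothetical single source of truth; docs and test names would reference this value.
const DEFAULT_MAX_TOKENS = 2048

type RepoMapArgs = {
  tokens: number
  focus: string[]
  stats: boolean
  invalidate: boolean
}

// Assumed flag syntax: space-separated values, e.g. `--tokens 512 --focus src/context.ts`.
function parseArgs(argv: string[]): RepoMapArgs {
  const args: RepoMapArgs = {
    tokens: DEFAULT_MAX_TOKENS,
    focus: [],
    stats: false,
    invalidate: false,
  }
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i]
    if (arg === '--tokens') {
      const n = Number(argv[++i])
      // Ignore missing, non-numeric, fractional, or non-positive budgets.
      if (isFinite(n) && n > 0 && Math.floor(n) === n) args.tokens = n
    } else if (arg === '--focus') {
      const value = argv[++i]
      if (value !== undefined) args.focus.push(value)
    } else if (arg === '--stats') {
      args.stats = true
    } else if (arg === '--invalidate') {
      args.invalidate = true
    }
  }
  return args
}
```

A test that asserts `parseArgs([]).tokens === DEFAULT_MAX_TOKENS`, rather than a literal number, cannot disagree with its own name when the default changes.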
|
Superseded by #966. Re-opened on a fresh branch (
Closing this in favor of #966 to keep the review thread clean. Thanks @Vasanthdev2004 for the careful catch on the package contents — that was a real bug that would have shipped a no-op |
Summary
- RepoMap tool the model can call on-demand during sessions, with support for focus_files and focus_symbols to narrow the ranking
- /repomap slash command for users to inspect, tune, and invalidate the map
- REPO_MAP feature flag (off by default)
What's included
- src/context/repoMap/ (12 files)
- src/context/repoMap/queries/ (3 .scm files)
- src/context/repoMap/__fixtures__/mini-repo/ (5 files)
- src/tools/RepoMapTool/ (4 files): buildTool wrapper registered in src/tools.ts, read-only, concurrency-safe
- src/commands/repomap/ (3 files): /repomap, --tokens, --focus, --stats, --invalidate
- src/context.ts: getRepoMapContext() memoized, gated behind feature('REPO_MAP')
- scripts/build.ts: REPO_MAP: false (off by default)
- docs/repo-map.md, README.md
How it works
Files imported by many others rank highest. Common symbol names (get, set, map, value) are down-weighted via IDF. Results are cached to disk keyed by (path, mtime, size), so only changed files are re-parsed.
Configuration
Supported languages
TypeScript, JavaScript, Python. Additional grammars in a follow-up.
Dependencies added
web-tree-sitter, tree-sitter-wasms, graphology, graphology-pagerank, graphology-operators, js-tiktoken (~80MB in node_modules)
Test plan
- bun install: clean
- bun test: 621 pass, 0 fail (32 new tests)
- bun run build: success
- bun run smoke: 0.1.8 (Open Claude)
- /repomap, --tokens, --focus, --stats, --invalidate
Known limitations
true after internal validation