perf+robustness: eight verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)#490

Open
DashBot-0001 wants to merge 5 commits into Egonex-AI:main from DashBot-0001:opt/verified-wins
DashBot-0001 commented Jun 21, 2026

What

Eight small, behavior-preserving fixes on hot paths, motivated by large-graph scaling (the 15k-file repo in #226, many-service monorepos, and the ~3k-node dashboard target). Each is verified against the project's existing tests (green before and after) and the perf/robustness-sensitive ones are benchmarked with byte-identical output.

| # | File | Change | Evidence |
|---|------|--------|----------|
| 1 | `skills/understand/merge-batch-graphs.py` | Precompile the 24-alternative project-prefix regex once at import instead of rebuilding + `re.escape`-ing the pattern string on every node. | `test_merge_batch_graphs.py` 69/69 pass; 19.5× faster, 0 parity mismatches over 50k node ids. |
| 2 | `packages/core/src/analyzer/tour-generator.ts` | Replace O(n²) `topoOrder.includes()` (per code node) with a `Set`, and the O(n) `queue.shift()` Kahn dequeue with a head cursor. | `tour-generator.test.ts` green; 81× faster at 20k nodes, byte-identical topo order. |
| 3 | `packages/core/src/analyzer/layer-detector.ts` | Collapse two full passes over `graph.nodes` into one; path-less file nodes deferred to Core to preserve ordering + Map key-insertion order. | `layer-detector.test.ts` green. |
| 4 | `packages/core/src/embedding-search.ts` | Hoist the query vector's magnitude out of the per-node cosine loop (invariant across a search). Same arithmetic, same order → bit-identical scores. | `embedding-search.test.ts` green. |
| 5 | `packages/dashboard/src/utils/louvain.ts` | Replace `Math.max(...Array.from(map.values())…)` with a reducing loop; the spread throws `RangeError: Maximum call stack size exceeded` at the ~3k+ node target. | Byte-identical over 5000 fuzz + edge cases; old form throws at 1M ids, new doesn't. |
| 6 | `core/.../tree-sitter-plugin.ts`, `registry.ts`, `types.ts`, `skills/understand/extract-structure.mjs` | `extract-structure.mjs` parsed every code file twice (`analyzeFile` then `extractCallGraph`). Add an optional `analyzeFileFull()` that parses once and runs both extractors on the same tree (caller falls back when absent). Also cache one reusable tree-sitter parser per language. | tree-sitter-plugin/plugin-registry/parsers suites green before+after (132/132); against the real TS grammar `analyzeFileFull` ≡ `analyzeFile` + `extractCallGraph` (0 mismatches) at ~39% less parse work per code file. |
| 7 | `skills/understand/compute-batches.mjs` | `buildNonCodeBatches` re-filtered the full path list once per Dockerfile/migration dir, which is O(dirs·N). Hoist the path list once + index paths by dir; Group A/D lookups become O(1). | Byte-identical on the `scan-result-non-code.json` fixture + 300 fuzzed repos (0 mismatches); 17–31× faster on 300–600-service monorepos (4250 files: 105ms → 3.4ms). |
| 8 | `packages/dashboard/src/utils/filters.ts` | `getEdgeCategory` linear-scanned every category's type array per edge in `filterEdges`. Build a reverse `edgeType → category` index once at module load; lookup is O(1). | Byte-identical over all known edge types + unknowns (0 mismatches). |
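
For reviewers, the hoist described in row 4 can be sketched as follows. This is a minimal illustration, not the code in `embedding-search.ts`: the `EmbeddedNode` shape and the names `dot`, `magnitude`, `cosineNaive` and `searchHoisted` are all hypothetical, and vectors are assumed to have equal length.

```typescript
// Hypothetical node shape; the real embedding-search.ts types may differ.
interface EmbeddedNode {
  id: string;
  vector: number[];
}

// Assumes a and b have the same length.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function magnitude(v: number[]): number {
  return Math.sqrt(dot(v, v));
}

// Baseline shape: the query magnitude is recomputed for every node.
function cosineNaive(query: number[], node: number[]): number {
  const denom = magnitude(query) * magnitude(node);
  return denom === 0 ? 0 : dot(query, node) / denom;
}

// Hoisted shape: the query magnitude is computed once per search.
// The per-node arithmetic is unchanged (same operands, same order),
// so each score is the same floating-point value as in cosineNaive.
function searchHoisted(
  query: number[],
  nodes: EmbeddedNode[],
): Array<{ id: string; score: number }> {
  const queryMag = magnitude(query);
  return nodes.map((node) => {
    const denom = queryMag * magnitude(node.vector);
    return {
      id: node.id,
      score: denom === 0 ? 0 : dot(query, node.vector) / denom,
    };
  });
}
```

The saving is one O(d) magnitude computation per node for a d-dimensional query; the result can be compared against the baseline with strict `===` rather than a tolerance.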

Why

tour-generator's includes() is the clearest asymptotic trap on a 15k-file repo. #6 halves the number of tree-sitter parses per code file on the indexing hot path (measured at ~39% less parse work); #7 is the same shape for non-code files on many-service monorepos (the microservices-demo case). #5 removes a real crash vector at the dashboard's stated scale. The rest are free: same outputs, less work, all on paths that run on every /understand, every search, and every layer/tour build.

Safety

All eight preserve behavior exactly. #2/#4/#5/#6/#7/#8 produce byte-identical output (verified by diffing old vs new over large/fuzzed inputs and, for #6, against the real tree-sitter grammar); #1/#3 are covered by the existing suites. #6 is additive (analyzeFileFull is optional, caller falls back). No breaking API changes.

Related

  • #491 (filed alongside this), "Knowledge-graph view still blocks the main thread": the view runs d3-force synchronously on the main thread (~4.3s block at 3k nodes), and dashboard/utils/layout.worker.ts is dead code. That's a bigger, behavior-changing fix (worker-ization + deterministic seeding), so it's tracked separately rather than bundled here.
  • skills/understand-domain/extract-domain-context.py — the gitignore glob→regex is unanchored (build matches mybuild/, over-ignoring); a correct fix needs proper segment-anchoring + a gitignore test suite, so it's left for a dedicated PR.

DashBot-0001 and others added 2 commits June 21, 2026 10:18
All four preserve behavior byte-for-byte and are [executed]-verified.

1. merge-batch-graphs.py — precompile the project-prefix regex once at
   import instead of rebuilding and `re.escape`-ing the 24-alternative pattern
   string on every node. The `re` compile cache is keyed on the final pattern
   string, so the compile was cached but the join + `re.escape` work was not.
   Verified: 69/69 unittest pass; 19.5x faster, 0 parity mismatches over 50k ids.

2. tour-generator.ts — replace O(n^2) `topoOrder.includes()` (per code node)
   with a Set, and the O(n) `queue.shift()` Kahn dequeue with a head cursor.
   Verified: tour-generator.test.ts green before+after; 81x faster at 20k
   nodes with byte-identical output (matters for the 15k-file repo, Egonex-AI#226).

3. layer-detector.ts — collapse two full passes over graph.nodes into one,
   deferring path-less file nodes to Core to preserve original ordering and
   Map key-insertion order. Verified: layer-detector.test.ts green before+after.

4. embedding-search.ts — hoist the query vector's magnitude out of the
   per-node cosine loop (it is invariant across a search). Same arithmetic
   order → bit-identical scores. Verified: embedding-search.test.ts green
   before+after.
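
The two changes in item 2 can be sketched as below. This is an illustrative version under stated assumptions, not the code in `tour-generator.ts`: the function names, the string node ids, and the idea that leftover nodes (for example cycle members) are appended after the topological order are all hypothetical.

```typescript
// Kahn's algorithm with a head cursor instead of queue.shift().
// shift() is O(n) per dequeue on an array; advancing an index is O(1).
function kahnOrder(nodes: string[], edges: Array<[string, string]>): string[] {
  const inDegree = new Map<string, number>();
  const adjacency = new Map<string, string[]>();
  for (const n of nodes) {
    inDegree.set(n, 0);
    adjacency.set(n, []);
  }
  for (const [from, to] of edges) {
    // Ignore edges that reference unknown nodes.
    if (!adjacency.has(from) || !adjacency.has(to)) continue;
    adjacency.get(from)?.push(to);
    inDegree.set(to, (inDegree.get(to) ?? 0) + 1);
  }

  const queue: string[] = nodes.filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  let head = 0; // read cursor; the array is only ever appended to
  while (head < queue.length) {
    const current = queue[head++];
    order.push(current);
    for (const next of adjacency.get(current) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  return order;
}

// Membership checks against the order use a Set (O(1) each) rather than
// order.includes() (O(n) each, so O(n^2) over all nodes).
function orderWithLeftovers(nodes: string[], edges: Array<[string, string]>): string[] {
  const order = kahnOrder(nodes, edges);
  const seen = new Set(order);
  for (const n of nodes) {
    if (!seen.has(n)) order.push(n); // e.g. nodes stuck in a cycle
  }
  return order;
}
```

Because the cursor visits queue entries in the same sequence `shift()` would have returned them, the resulting order is unchanged; only the cost per dequeue differs.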

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
detectCommunities reassigns -1 community sentinels to ids past the current
max via `Math.max(...Array.from(map.values()).filter(v => v >= 0), -1)`.
Spreading every community id as call arguments throws
`RangeError: Maximum call stack size exceeded` once the node count crosses
the engine's argument limit — reachable on the ~3k+ node graphs this
dashboard targets. Replace with a reducing loop: same result (verified
byte-for-byte over 5000 fuzz cases + edge cases), no spread, and it also
drops the throwaway filtered array allocation.
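
A minimal sketch of the before and after shapes described above. The function names and the `Map<string, number>` signature are assumptions for illustration, and `reassignSentinels` (giving each `-1` entry its own fresh id) is a hypothetical reading of how the max is used, not the actual `detectCommunities` code.

```typescript
// Old shape (per the commit message): every id becomes a call argument,
// so very large maps can exceed the engine's argument limit.
function maxCommunityIdSpread(communities: Map<string, number>): number {
  return Math.max(...Array.from(communities.values()).filter((v) => v >= 0), -1);
}

// New shape: a single pass, no spread, no intermediate filtered array.
function maxCommunityIdLoop(communities: Map<string, number>): number {
  let max = -1;
  communities.forEach((id) => {
    if (id >= 0 && id > max) max = id;
  });
  return max;
}

// Hypothetical helper: move each -1 sentinel to an id past the current max.
function reassignSentinels(communities: Map<string, number>): void {
  let next = maxCommunityIdLoop(communities) + 1;
  communities.forEach((id, node) => {
    // Overwriting an existing key during forEach does not add new entries.
    if (id === -1) communities.set(node, next++);
  });
}
```

The loop returns `-1` for an empty map or a map holding only sentinels, which matches the `-1` fallback argument in the spread form.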

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DashBot-0001 changed the title from "perf: four verified low-risk hot-path optimizations (tour-gen O(n²)→O(n), regex precompile, etc.)" to "perf+robustness: five verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), louvain crash, regex precompile)" on Jun 21, 2026
…rsers

`extract-structure.mjs` calls `analyzeFile` then `extractCallGraph` on every
code file (lines 100 + 109) — two full tree-sitter parses of identical
content, on the indexing hot path that runs on every `/understand`.

- Add `TreeSitterPlugin.analyzeFileFull()` (and an optional `AnalyzerPlugin`
  interface method + `PluginRegistry` delegation) that parses once and runs
  both extractors on the same rootNode. Both extractors are pure functions of
  the tree, so output is byte-identical to the two separate calls.
- `extract-structure.mjs` uses it when present and falls back to the original
  two-call path (preserving their independent error degradation) when the
  registry/plugin lacks it or it throws.
- `getParser` now caches one reusable parser per language instead of
  allocating `new Parser()` + `setLanguage` and `delete()`-ing it on every
  call. Trees are still deleted per parse.
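
The caller-side pattern in the bullets above might look roughly like this. Everything here is a hypothetical sketch: the interface, the result shapes, the synchronous signatures, and the choice of `null` for a failed extractor are assumptions, not the real `AnalyzerPlugin` API or the real degradation behavior in `extract-structure.mjs`.

```typescript
// Hypothetical result and plugin shapes, for illustration only.
interface StructureSketch { functions: string[] }
interface CallGraphSketch { calls: Array<[string, string]> }

interface AnalyzerPluginSketch {
  analyzeFile(path: string, source: string): StructureSketch;
  extractCallGraph(path: string, source: string): CallGraphSketch;
  // Optional single-parse entry point; older plugins will not have it.
  analyzeFileFull?(
    path: string,
    source: string,
  ): { structure: StructureSketch; callGraph: CallGraphSketch };
}

function analyzeWithFallback(
  plugin: AnalyzerPluginSketch,
  path: string,
  source: string,
): { structure: StructureSketch | null; callGraph: CallGraphSketch | null } {
  if (typeof plugin.analyzeFileFull === "function") {
    try {
      return plugin.analyzeFileFull(path, source); // one parse, both results
    } catch {
      // Fall through to the two-call path below.
    }
  }
  // Original two-call path: each extractor fails independently.
  let structure: StructureSketch | null = null;
  let callGraph: CallGraphSketch | null = null;
  try {
    structure = plugin.analyzeFile(path, source);
  } catch {
    structure = null; // assumed degradation: keep going without structure
  }
  try {
    callGraph = plugin.extractCallGraph(path, source);
  } catch {
    callGraph = null; // assumed degradation: keep going without a call graph
  }
  return { structure, callGraph };
}
```

Keeping the fast path optional means a registry or plugin that predates `analyzeFileFull` keeps working unchanged, at the old cost of two parses.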

Verified [executed]:
- Project's own suites green before+after: tree-sitter-plugin, plugin-registry,
  parsers, tour-generator, layer-detector, embedding-search — 132/132.
- Against the real tree-sitter TS grammar: analyzeFileFull.structure ≡
  analyzeFile and analyzeFileFull.callGraph ≡ extractCallGraph (0 mismatches
  over many files and repeated reuse of the cached parser); single-parse is
  ~39% less parse work per code file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DashBot-0001 changed the title from "perf+robustness: five verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), louvain crash, regex precompile)" to "perf+robustness: six verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)" on Jun 21, 2026
`buildNonCodeBatches` re-materialized `[...byPath.keys()]` seven times and,
for Group A (Dockerfile clusters) and Group D (SQL migrations), re-filtered
the full path list once per directory — O(dirs·N). On a many-service
monorepo (one Dockerfile per service) that was the dominant cost.

Hoist the path list once and build a dir→paths index once; Group A/D lookups
become O(1). Output is byte-for-byte identical (the path ordering, sorts and
group boundaries are all preserved).
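
As a rough illustration of the index described above (written in TypeScript rather than the `.mjs` source, with hypothetical names, and assuming files are grouped by their direct parent directory, which may not match the real grouping rule in `compute-batches.mjs`):

```typescript
function dirOf(path: string): string {
  const slash = path.lastIndexOf("/");
  return slash === -1 ? "." : path.slice(0, slash);
}

// Old shape: one full scan of the path list per directory, O(dirs * N) overall.
function pathsInDirScan(paths: string[], dir: string): string[] {
  return paths.filter((p) => dirOf(p) === dir);
}

// New shape: one pass builds dir -> paths; each later lookup is O(1).
// Buckets are filled in input order, so per-directory ordering is preserved.
function indexByDir(paths: string[]): Map<string, string[]> {
  const byDir = new Map<string, string[]>();
  for (const p of paths) {
    const dir = dirOf(p);
    const bucket = byDir.get(dir);
    if (bucket) {
      bucket.push(p);
    } else {
      byDir.set(dir, [p]);
    }
  }
  return byDir;
}
```

A caller would build the index once, then replace each per-directory filter with `byDir.get(dir) ?? []`.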

Verified [executed]: byte-identical to the previous implementation on the
`scan-result-non-code.json` fixture and over 300 fuzzed non-code repos
(0 mismatches); 17–31× faster on 300–600-service synthetic monorepos
(e.g. 4250 files: 105ms → 3.4ms).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DashBot-0001 changed the title from "perf+robustness: six verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)" to "perf+robustness: seven verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)" on Jun 21, 2026
`getEdgeCategory` linear-scanned every category's type array (and re-ran
`Object.entries(EDGE_CATEGORY_MAP)`) for every edge, inside `filterEdges`.
Build a reverse `edgeType → category` index once at module load; lookup is
now O(1). First-category-wins order preserved.
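
A small sketch of the reverse index, including the first-category-wins rule. The category names, edge type names, and the `undefined` result for unknown types are made up for illustration; the real `EDGE_CATEGORY_MAP` and its fallback in `filters.ts` may differ.

```typescript
// Hypothetical category map; "imports" appears twice on purpose to show
// that the first category listed keeps ownership of a duplicated type.
const EDGE_CATEGORY_MAP_SKETCH: Record<string, string[]> = {
  structural: ["imports", "contains"],
  behavioral: ["calls", "imports"],
};

// Old shape: scan every category's type array for every edge.
function getEdgeCategoryScan(edgeType: string): string | undefined {
  for (const [category, types] of Object.entries(EDGE_CATEGORY_MAP_SKETCH)) {
    if (types.includes(edgeType)) return category;
  }
  return undefined;
}

// New shape: build edgeType -> category once at module load.
const EDGE_TYPE_TO_CATEGORY = new Map<string, string>();
for (const [category, types] of Object.entries(EDGE_CATEGORY_MAP_SKETCH)) {
  for (const type of types) {
    // Only the first category to claim a type is recorded.
    if (!EDGE_TYPE_TO_CATEGORY.has(type)) EDGE_TYPE_TO_CATEGORY.set(type, category);
  }
}

function getEdgeCategoryIndexed(edgeType: string): string | undefined {
  return EDGE_TYPE_TO_CATEGORY.get(edgeType);
}
```

The `has` guard is what keeps the index equivalent to the scan when a type is listed under more than one category; without it, the last category would win instead.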

Verified [executed]: byte-identical to the scan over all known edge types
plus unknowns (0 mismatches).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DashBot-0001 changed the title from "perf+robustness: seven verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)" to "perf+robustness: eight verified hot-path fixes for large graphs (tour-gen O(n²)→O(n), single-parse, louvain crash, …)" on Jun 21, 2026