feat(core): broaden .understandignore starter — C++ test/benchmark patterns#480

Open
thejesh23 wants to merge 1 commit into Egonex-AI:main from thejesh23:feat/understandignore-cpp-test-patterns

@thejesh23 thejesh23 commented Jun 19, 2026

Contributor

Summary

  • EXACT_DIR_NAMES += unittest, bench, benchmark, benchmarks
  • New TEST_PATTERN_GROUPS entry C++ with 10 patterns (gtest snake_case, folly/LLVM PascalCase, Chromium _unittest / _browsertest, co-located benchmarks)
  • +11 tests in ignore-generator.test.ts; build and lint clean, full suite passing (758/758)

SKILL.md unchanged — Phase 0.5 delegates to generate-ignore.mjs since a0155c5, so no inline duplicate to keep in sync.
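
As a rough illustration of the summary above, the new group could be modeled as plain data and rendered as commented-out starter lines. This is only a sketch: the `PatternGroup` interface, `cppGroup`, and `renderGroup` are hypothetical names, the real `TEST_PATTERN_GROUPS` shape in core may differ, and only the six patterns named in the commit message are listed (the PR adds 10).

```typescript
// Hypothetical shape; the actual TEST_PATTERN_GROUPS type may differ.
interface PatternGroup {
  label: string;
  patterns: string[];
}

// Only the six patterns spelled out in the commit message appear here.
const cppGroup: PatternGroup = {
  label: "C++",
  patterns: [
    "*_test.cc",        // gtest snake_case
    "*Test.cpp",        // folly/LLVM PascalCase
    "*_unittest.cc",    // Chromium unit tests
    "*_browsertest.cc", // Chromium browser tests
    "*_benchmark.cc",   // co-located benchmarks
    "*Benchmark.cpp",
  ],
};

// Render every suggestion as a comment, so nothing is ignored until the
// user opts in by uncommenting a line.
function renderGroup(group: PatternGroup): string {
  const header = `# ${group.label} test/benchmark files`;
  return [header, ...group.patterns.map((p) => `# ${p}`)].join("\n");
}
```

Every rendered line starts with `#`, which mirrors the opt-in model described in the commit message.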

Why these patterns (and not others)

Surveyed 18 major public C++ projects (LLVM, abseil, folly, protobuf, grpc, Chromium, bitcoin, mongo, godot, opencv, nlohmann/json, Catch2, doctest, …). Rejected as too project-specific: unit-*.cpp (nlohmann-only), perf/ (OpenCV-only, name too generic), *_perftest.cc (Chromium-only), gtest/ (collides with vendored-framework path). The interleaved-tests case (abseil, Chromium, protobuf) is the load-bearing argument for file-pattern lines over dir-only rules.
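
The interleaved-tests argument can be sketched with a tiny matcher. Everything here is illustrative rather than the generator's actual logic: `globToRegExp`, `ignoredByDirRule`, and `ignoredByFilePattern` are hypothetical helpers, the sample paths are made up in the style of abseil's layout, and `tests` is assumed to be an existing directory rule.

```typescript
// Convert a glob that only uses `*` into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*");                 // `*` matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Assumed directory rules: `tests` plus the four names added in this PR.
const DIR_NAMES = new Set(["tests", "unittest", "bench", "benchmark", "benchmarks"]);
const FILE_PATTERNS = ["*_test.cc", "*_unittest.cc", "*_benchmark.cc"].map(globToRegExp);

// True when any parent directory segment is an exact-name match.
function ignoredByDirRule(path: string): boolean {
  return path.split("/").slice(0, -1).some((segment) => DIR_NAMES.has(segment));
}

// True when the file's base name matches one of the file patterns.
function ignoredByFilePattern(path: string): boolean {
  const base = path.split("/").pop() ?? "";
  return FILE_PATTERNS.some((re) => re.test(base));
}

const samplePaths = [
  "absl/strings/str_cat.cc",
  "absl/strings/str_cat_test.cc",      // test sits next to its source
  "absl/strings/str_cat_benchmark.cc", // benchmark sits next to its source
  "tests/integration/smoke.cc",
];

const dirOnly = samplePaths.filter(ignoredByDirRule);
const dirAndFile = samplePaths.filter(
  (p) => ignoredByDirRule(p) || ignoredByFilePattern(p),
);
```

With directory rules alone only the `tests/` file is excluded; the co-located test and benchmark are caught only once the file patterns are applied.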

Token impact (measured on 7 public C++ projects)

Methodology: file sizes taken from gh api .../git/trees?recursive=1; C/C++ source counted as .cc .cpp .cxx .c .h .hpp; tokens estimated at 1 MiB ≈ 262K tokens (about 4 bytes per token).
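
A minimal sketch of that estimate, assuming tree entries carry the `path`, `type`, and `size` fields that the Git trees API returns. The `TreeEntry` interface and the `isCppSource` and `estimateTokens` helpers are hypothetical names, not part of this PR.

```typescript
// Subset of the fields returned per entry by the Git trees API.
interface TreeEntry {
  path: string;
  type: string;  // "blob" for files, "tree" for directories
  size?: number; // bytes; present on blobs
}

const SOURCE_EXTENSIONS = [".cc", ".cpp", ".cxx", ".c", ".h", ".hpp"];
const BYTES_PER_MIB = 1024 * 1024;
const TOKENS_PER_MIB = 262144; // the "1 MiB ≈ 262K tokens" rule of thumb

function isCppSource(path: string): boolean {
  return SOURCE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Sum the sizes of C/C++ source blobs and convert bytes to estimated tokens.
function estimateTokens(entries: TreeEntry[]): number {
  const bytes = entries
    .filter((e) => e.type === "blob" && isCppSource(e.path))
    .reduce((sum, e) => sum + (e.size ?? 0), 0);
  return Math.round((bytes / BYTES_PER_MIB) * TOKENS_PER_MIB);
}
```

Running the same function over the tree before and after applying the ignore rules gives the kind of before/after figures reported in the table.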

| Project | Before (analyzed) | After (analyzed) | Reduction |
| --- | --- | --- | --- |
| abseil/abseil-cpp | 9.68 MB / ~2.54M tok | 5.57 MB / ~1.46M tok | −1.08M (43%) |
| protocolbuffers/protobuf | 22.04 MB / ~5.78M tok | 18.46 MB / ~4.84M tok | −0.94M (16%) |
| grpc/grpc | 44.72 MB / ~11.72M tok | 32.25 MB / ~8.46M tok | −3.27M (28%) |
| nlohmann/json | 5.38 MB / ~1.41M tok | 2.07 MB / ~0.54M tok | −0.87M (62%) |
| bitcoin/bitcoin | 18.80 MB / ~4.93M tok | 14.10 MB / ~3.70M tok | −1.23M (25%) |
| facebook/folly | 20.68 MB / ~5.42M tok | 11.18 MB / ~2.93M tok | −2.49M (46%) |
| tensorflow/tensorflow | 172.42 MB / ~45.20M tok | 118.50 MB / ~31.06M tok | −14.13M (31%) |
| Weighted total | 293.72 MB / ~77.00M tok | 202.13 MB / ~52.99M tok | −24.01M (31%) |

Per-project median ~31% reduction (range 16–62%); volume-weighted across the sample ~31%. Largest absolute saving on tensorflow (−14.1M tokens) due to pervasive *_test.cc companions throughout its source tree.
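
For readers who want to check the two summary figures, here is a small sketch that recomputes them from the rounded MB values in the table. The variable and function names are illustrative, and because the inputs are rounded, individual percentages can differ by a point from the table.

```typescript
// [project, before MB, after MB], copied from the table above.
const sizesMb: Array<[string, number, number]> = [
  ["abseil/abseil-cpp", 9.68, 5.57],
  ["protocolbuffers/protobuf", 22.04, 18.46],
  ["grpc/grpc", 44.72, 32.25],
  ["nlohmann/json", 5.38, 2.07],
  ["bitcoin/bitcoin", 18.8, 14.1],
  ["facebook/folly", 20.68, 11.18],
  ["tensorflow/tensorflow", 172.42, 118.5],
];

// Percentage reduction for each project.
const reductions = sizesMb.map(([, before, after]) => ((before - after) / before) * 100);

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Volume-weighted reduction: total bytes removed over total bytes before.
const totalBefore = sizesMb.reduce((sum, [, before]) => sum + before, 0);
const totalAfter = sizesMb.reduce((sum, [, , after]) => sum + after, 0);
const weightedReduction = ((totalBefore - totalAfter) / totalBefore) * 100;
const medianReduction = median(reductions);
```

The median is driven by the middle project (tensorflow), while the weighted figure is dominated by tensorflow's size, which is why the two land so close together in this sample.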

Test plan

  • pnpm --filter @understand-anything/core build — clean
  • pnpm --filter @understand-anything/core test — 758/758 pass (38 in ignore-generator.test.ts)
  • pnpm lint — clean
  • Manual preview of generated .understandignore on a fixture with tests/, benchmarks/, unittest/, bench/ dirs — output as expected

Refs #479

🤖 Generated with Claude Code

…tterns

Add four directory names (unittest, bench, benchmark, benchmarks) and a
new C++ test-file pattern group covering gtest snake_case (*_test.cc),
folly/LLVM PascalCase (*Test.cpp), Chromium *_unittest.cc / *_browsertest.cc,
and co-located benchmarks (*_benchmark.cc, *Benchmark.cpp).

The interleaved-tests case (abseil, Chromium, protobuf) needs file-pattern
exclusions, not just dir rules — measured ~16-46% additional token reduction
on those repos vs. directory-only ignores. Median ~35% across 6 surveyed
projects (abseil, protobuf, grpc, nlohmann/json, bitcoin, folly).

All suggestions stay commented-out — same opt-in model as f5682bf. SKILL.md
needs no update; Phase 0.5 already delegates to generate-ignore.mjs (a0155c5).

Refs Egonex-AI#479

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>