Feat/new parsing layer #20

Open
fredbi wants to merge 142 commits into go-openapi:master from fredbi:feat/new-parsing-layer

Conversation

@fredbi
Member

@fredbi fredbi commented Apr 21, 2026

Change type

Please select: 🆕 New feature or enhancement | 🔧 Bug fix | 📃 Documentation update

Short description

Fixes

Full description

Checklist

  • I have signed all my commits with my name and email (see DCO). This does not require a PGP-signed commit.
  • I have rebased and squashed my work, so only one commit remains
  • I have added tests to cover my changes.
  • I have properly enriched go doc comments in code.
  • I have properly documented any breaking change.

@codecov

codecov Bot commented Apr 21, 2026

❌ 1 Tests Failed:

Tests completed   Failed   Passed   Skipped
992               1        991      0
View the top 1 failed test(s) by shortest run time
github.com/go-openapi/codescan/internal/parsers/grammar/gen::TestGeneratedDocIsCurrent
Stack Traces | 0s run time
Failed


fredbi and others added 29 commits May 26, 2026 08:33
Place empty package files under internal/parsers/grammar/ as the
landing zone for P1 work: preprocess.go, lexer.go, parser.go, ast.go,
diagnostic.go, style.go. Each file carries a TODO pointer to the P1
task. No behavior; go build ./... remains clean.

See .claude/plans/grammar-parser-architecture.md and
.claude/plans/grammar-parser-tasks.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add keywords.go (types + functional-options constructor + Lookup/Keywords
accessors) and keywords_table.go (the authored []Keyword data: 35 entries
covering validations, property flags, meta single-line, block headers,
plus W5 externalDocs).

Design choices, per architecture §3.4 / §2.2.1 and tasks P0.2:
  - Kind enum lists the 8 sub-contexts where keywords may appear
    (Param, Header, Schema, Items, Route, Operation, Meta, Response).
  - ValueType covers the primitive-typed values the parser will
    convert in-line (Number/Integer/Boolean/StringEnum) plus the
    deferred categories (String verbatim, CommaList, RawBlock, RawValue).
  - Option A for docs: each keyword carries per-context doc strings
    (inParam("…"), inSchema("…"), …) so LSP can show tooltips that
    match where the cursor sits.
  - W5 opportunistic: externalDocs entry landed alongside v1 keywords.
  - W7 opportunistic: per-keyword legal-contexts list is exactly the
    seed data LSP completion will consult; seeded from observed v1
    behavior (regexprs.go + tagger trees).

Drop inResponse() and doc() options — no v1 keyword uses them; we
re-add if W6/W2 surface the need.

Add keywords_test.go covering Lookup (canonical, alias, case/space
normalization, unknown) and a shape invariant (every keyword has a
name, ≥1 context, and StringEnum implies Values).

Also add missing SPDX headers to the P0.1 placeholder files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
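
For readers who want a concrete picture of the functional-options constructor and the normalizing Lookup described in the commit above, here is a minimal sketch. The names `newKeyword`, `withAliases`, `inParam`, `inSchema` and `normalize`, the trimmed `Keyword` struct, and the two table entries are assumptions made for this example; they are not taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is the sub-context in which a keyword may appear (subset shown).
type Kind int

const (
	KindParam Kind = iota + 1
	KindSchema
)

// Keyword is a hypothetical, trimmed-down table entry.
type Keyword struct {
	Name    string
	Aliases []string
	Docs    map[Kind]string // per-context doc strings
}

// option mutates a Keyword while it is being constructed.
type option func(*Keyword)

func withAliases(a ...string) option { return func(k *Keyword) { k.Aliases = a } }
func inParam(doc string) option      { return func(k *Keyword) { k.Docs[KindParam] = doc } }
func inSchema(doc string) option     { return func(k *Keyword) { k.Docs[KindSchema] = doc } }

func newKeyword(name string, opts ...option) Keyword {
	k := Keyword{Name: name, Docs: map[Kind]string{}}
	for _, o := range opts {
		o(&k)
	}
	return k
}

// table holds two illustrative entries only.
var table = []Keyword{
	newKeyword("maximum", withAliases("max"),
		inParam("upper bound of the parameter"),
		inSchema("upper bound of the property")),
	newKeyword("maxLength", withAliases("max-length", "max length"),
		inSchema("maximum string length")),
}

// normalize lower-cases and drops spaces and hyphens, so that
// "Max Length", "max-length" and "maxLength" compare equal.
func normalize(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	return strings.NewReplacer(" ", "", "-", "").Replace(s)
}

// Lookup resolves a canonical name or an alias to its table entry.
func Lookup(name string) (Keyword, bool) {
	n := normalize(name)
	for _, k := range table {
		if normalize(k.Name) == n {
			return k, true
		}
		for _, a := range k.Aliases {
			if normalize(a) == n {
				return k, true
			}
		}
	}
	return Keyword{}, false
}

func main() {
	k, ok := Lookup("MAX")
	fmt.Println(k.Name, ok)
}
```

Keeping the per-context doc strings in a map keyed by Kind means the list of legal contexts can be derived from the same data, which is one way to satisfy the "name, at least one context" shape invariant.
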
Add Severity (Error/Warning/Hint), Code (dotted stable identifier), and
Diagnostic{Pos, Severity, Code, Message}. Codes prefixed "parse." for
the grammar layer; sub-parser subpackages use their own prefix so codes
stay globally unique.

Pre-declare 10 codes the parser and its analyzers will emit in P1/P2
(invalid-number/integer/boolean/string-enum, unknown-keyword,
context-invalid, invalid-extension-name, unterminated-yaml,
invalid-annotation, malformed-line). The list grows as sites surface.

Expose Errorf/Warnf/Hintf constructors (formatted Message) and a
compiler-style Diagnostic.String() rendering so editor jump-to-line
tooling can consume the output directly.

Tests cover Severity.String, each constructor, the render format, and
the empty-position fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
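
The diagnostic type described in the commit above can be pictured with the following sketch. The field layout follows the commit message; the exact rendering format, the label strings, and the `at` helper are assumptions for illustration.

```go
package main

import (
	"fmt"
	"go/token"
)

type Severity int

const (
	SeverityError Severity = iota
	SeverityWarning
	SeverityHint
)

func (s Severity) String() string {
	switch s {
	case SeverityError:
		return "error"
	case SeverityWarning:
		return "warning"
	case SeverityHint:
		return "hint"
	default:
		return "unknown"
	}
}

// Code is a dotted, stable identifier such as "parse.invalid-number".
type Code string

const CodeInvalidNumber Code = "parse.invalid-number"

type Diagnostic struct {
	Pos      token.Position
	Severity Severity
	Code     Code
	Message  string
}

// Errorf and Warnf build a Diagnostic with a formatted Message.
func Errorf(pos token.Position, code Code, format string, args ...any) Diagnostic {
	return Diagnostic{Pos: pos, Severity: SeverityError, Code: code, Message: fmt.Sprintf(format, args...)}
}

func Warnf(pos token.Position, code Code, format string, args ...any) Diagnostic {
	return Diagnostic{Pos: pos, Severity: SeverityWarning, Code: code, Message: fmt.Sprintf(format, args...)}
}

// String renders "file:line:col: severity: message [code]" (assumed format).
// token.Position.String already falls back to "-" for an empty position.
func (d Diagnostic) String() string {
	return fmt.Sprintf("%s: %s: %s [%s]", d.Pos.String(), d.Severity, d.Message, d.Code)
}

// at is a hypothetical helper that builds a position for the example.
func at(file string, line, col int) token.Position {
	return token.Position{Filename: file, Line: line, Column: col}
}

func main() {
	fmt.Println(Errorf(at("pets.go", 12, 4), CodeInvalidNumber, "bad number %q", "abc"))
	fmt.Println(Warnf(at("", 0, 0), CodeInvalidNumber, "no position"))
}
```
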
Add internal/parsers/grammar/gen — a small command that reads the
authoritative Keywords() slice at `go generate` time and renders it to
docs/annotation-keywords.md (summary table + per-keyword details with
aliases, value type, legal contexts, per-context docs).

  - //go:generate directive lives in keywords_table.go, runs
    `go run ./gen -out ../../../docs/annotation-keywords.md`.
  - Output is deterministic (same input -> byte-identical file), which
    P0.5 will enforce in CI.
  - Named constants for exit codes and perms avoid mnd lint flags; no
    nolint directives needed.

Supporting additions:
  - Kind.String() and ValueType.String() — labels used by the generator
    (and later by P1.7 context-invalid diagnostic messages).
  - `exhaustive` lint satisfied by explicit `KindUnknown`/`ValueNone`
    cases with `fallthrough` to default.

Generated docs/annotation-keywords.md covers the 34 v1-parity keywords,
rendered at 326 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Expose the generator's Render() function and add gen_test.go with
TestGeneratedDocIsCurrent: reads the committed
docs/annotation-keywords.md and compares it byte-for-byte against a
fresh render of the current keyword table. If the table changes and
the doc isn't regenerated, CI fails with an actionable message
telling the developer to run go generate and commit.

Rationale for putting the check in a _test.go rather than a dedicated
workflow: this repo delegates CI plumbing to go-openapi/ci-workflows'
shared workflows. Adding a bespoke workflow just for this check would
break that pattern for a one-line assertion. A unit test runs as part
of the existing go-test job at zero ceremony.

Completes P0. Next: P1 core parser pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
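
The staleness check behind TestGeneratedDocIsCurrent (the test reported as failing in the codecov comment) boils down to a byte-for-byte comparison against a deterministic render. The sketch below shows the idea with a stand-in `render` function and a hypothetical `checkCurrent` helper; the real generator's output format and function signatures are not shown in this page.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// render stands in for the generator's Render(): it produces the same
// bytes for the same set of keyword names, whatever their input order.
func render(names []string) []byte {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)

	var b strings.Builder
	b.WriteString("# Annotation keywords\n\n")
	for _, n := range sorted {
		fmt.Fprintf(&b, "- `%s`\n", n)
	}
	return []byte(b.String())
}

var errStale = errors.New("docs/annotation-keywords.md is stale: run `go generate ./...` and commit the result")

// checkCurrent compares the committed document against a fresh render
// and returns an actionable error when they differ.
func checkCurrent(committed []byte, names []string) error {
	if bytes.Equal(committed, render(names)) {
		return nil
	}
	return errStale
}

func main() {
	committed := render([]string{"maximum", "minimum"})
	fmt.Println(checkCurrent(committed, []string{"minimum", "maximum"}))
	fmt.Println(checkCurrent(committed, []string{"maximum", "minimum", "pattern"}))
}
```

In a real test the committed bytes would come from reading the file on disk, and the error message would be passed to the test's failure call.
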
Add Preprocess(cg, fset) -> []Line{Text, Pos}. Handles both // and
/* */ forms, including multi-line blocks with continuation asterisks.
Leading godoc decoration (whitespace, *, /, -, optional markdown
table pipe) is stripped via trimContentPrefix, mirroring the v1
rxUncommentHeaders regex so fixtures stay parity-compatible at the
parse-output level.

Position tracking: each Line carries the token.Position of the first
character of Text. Continuation lines inside a /* ... */ block report
Column=1 for simplicity; precise column reconstruction would require
re-tokenising and is deferred until LSP needs it.

Fence-body indentation is not preserved here — fence state lives at
the lexer layer (P1.2), so the preprocessor stays stateless and
position-only. Documented in the godoc.

Tests cover:
  - nil CommentGroup / FileSet returns nil
  - single-line and multi-line // comments
  - /* ... */ blocks with leading '*' decorations
  - markdown table-pipe stripping (and whitespace after the pipe)
  - embedded whitespace preserved inside Text
  - multiple *ast.Comment entries in one CommentGroup

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Tighten stripComment so Line.Pos.Line, .Column, and .Offset are all
accurate on every emitted line, including continuation lines inside
/* ... */ blocks. Previously the column was approximated to 1 on
block-comment continuation lines, which would have forced a
retrofit when LSP consumed Line.Pos.

Factor the per-line math into stripLine(s, pos) which advances pos
by the number of bytes trimContentPrefix consumed. Block-comment
paths compute each line's starting Offset by tracking the byte index
of the current line within the comment body and adding the "/*"
marker length.

Add column-precision tests:
  - // line comment: `foo` sits at Column=4 ("//", space, f)
  - /* block comment with " * prefix": content after " * " at col 4
  - indented block with tab continuation: content at col 4
  - offset monotonicity across a multi-line // group

Minor: extract `wantModelFoo` const in the test file to satisfy
goconst on the now-reused fixture string.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
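
A minimal sketch of the per-line position math described in the commit above: trim the decoration, then advance the position by the number of bytes consumed. The strip set used here (space, tab, `*`, `/`) and the `lineStart` helper are assumptions; the real preprocessor also handles the table pipe and the comment terminators.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// trimContentPrefix removes leading comment decoration.
// The strip set is an assumption for this sketch.
func trimContentPrefix(s string) string {
	return strings.TrimLeft(s, " \t*/")
}

// stripLine trims s and advances pos by the number of bytes that
// trimContentPrefix consumed, so pos points at the first byte of the text.
func stripLine(s string, pos token.Position) (string, token.Position) {
	text := trimContentPrefix(s)
	consumed := len(s) - len(text)
	pos.Column += consumed
	pos.Offset += consumed
	return text, pos
}

// lineStart is a hypothetical helper: the position of column 1 on a line.
func lineStart(line, offset int) token.Position {
	return token.Position{Line: line, Column: 1, Offset: offset}
}

func main() {
	text, pos := stripLine("// foo", lineStart(1, 0))
	fmt.Printf("%q line=%d col=%d offset=%d\n", text, pos.Line, pos.Column, pos.Offset)
}
```

Columns in go/token are byte counts, so advancing by the consumed byte length keeps Column and Offset consistent with each other.
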
Add TokenKind enum (EOF, BLANK, TEXT, ANNOTATION, KEYWORD_VALUE,
KEYWORD_BLOCK_HEAD, YAML_FENCE) and Token struct with per-kind fields
(Text, Value, Keyword, Args, ItemsDepth). Lex() emits one token per
preprocessed line plus a trailing TokenEOF.

Classification rules (per line):
  - empty or whitespace-only          -> BLANK
  - "---" (trim-equal)                -> YAML_FENCE
  - starts with "swagger:"            -> ANNOTATION (Text=name, Args=rest)
  - "[items.]*<keyword>: <value>"     -> KEYWORD_VALUE  (Value populated)
  - "[items.]*<keyword>:"             -> KEYWORD_BLOCK_HEAD
  - otherwise                         -> TEXT

Keyword recognition goes through grammar.Lookup, which already
handles case-insensitivity and aliases — canonical name is written
into Token.Text and the matching *Keyword is attached.

stripItemsPrefix mirrors rxItemsPrefixFmt: `(?:[Ii]tems[.\s]*)+`.
It does NOT overeat: "maxItems" stays a single keyword (prefix check
is anchored at position 0 of the token text, not sub-matched); and
bare "items:" stays a non-keyword TEXT since nothing in the table
matches "items" alone.

Position tracking: Token.Pos is advanced past any stripped `items.`
prefix so it points at the keyword's first character.

Godoc-identifier-prefix form ("DoFoo swagger:route ...") is NOT
handled in the lexer — deferred to P1.4 where the parser orchestrates
annotation discovery and can decide case-by-case.

Tests cover each token kind, each items-prefix depth variant, the
"maxItems must not overeat" edge, canonical-name resolution from
aliases (MAX -> maximum, max-length -> maxLength), malformed
"swagger:" falling back to TEXT, and Pos advancement after `items.`
prefix stripping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
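
The classification rules listed in the commit above can be sketched as a single function. This is an illustration under stated assumptions: the `keywords` map stands in for grammar.Lookup with four made-up entries, positions are omitted, and the Token struct is reduced to the fields the rules mention.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

type TokenKind int

const (
	TokenBlank TokenKind = iota
	TokenText
	TokenAnnotation
	TokenKeywordValue
	TokenKeywordBlockHead
	TokenYAMLFence
)

type Token struct {
	Kind       TokenKind
	Text       string
	Value      string
	Args       []string
	ItemsDepth int
}

// keywords is a stand-in for the keyword table: lower-cased name or
// alias mapped to the canonical spelling.
var keywords = map[string]string{
	"maximum": "maximum", "max": "maximum",
	"maxitems": "maxItems", "consumes": "consumes",
}

// Anchored at position 0, so "maxItems" is never treated as a prefix.
var rxItemsPrefix = regexp.MustCompile(`^(?:[Ii]tems[.\s]*)+`)

// stripItemsPrefix removes leading "items." segments and counts them.
func stripItemsPrefix(s string) (rest string, depth int) {
	prefix := rxItemsPrefix.FindString(s)
	return s[len(prefix):], strings.Count(strings.ToLower(prefix), "items")
}

func classify(line string) Token {
	trimmed := strings.TrimSpace(line)
	switch {
	case trimmed == "":
		return Token{Kind: TokenBlank}
	case trimmed == "---":
		return Token{Kind: TokenYAMLFence}
	case strings.HasPrefix(trimmed, "swagger:"):
		name, args, _ := strings.Cut(strings.TrimPrefix(trimmed, "swagger:"), " ")
		if name == "" {
			return Token{Kind: TokenText, Text: trimmed} // malformed "swagger:"
		}
		return Token{Kind: TokenAnnotation, Text: name, Args: strings.Fields(args)}
	}

	rest, depth := stripItemsPrefix(trimmed)
	name, value, found := strings.Cut(rest, ":")
	canonical, known := keywords[strings.ToLower(strings.TrimSpace(name))]
	if !found || !known {
		return Token{Kind: TokenText, Text: trimmed}
	}
	value = strings.TrimSpace(value)
	if value == "" {
		return Token{Kind: TokenKeywordBlockHead, Text: canonical, ItemsDepth: depth}
	}
	return Token{Kind: TokenKeywordValue, Text: canonical, Value: value, ItemsDepth: depth}
}

func main() {
	fmt.Printf("%+v\n", classify("items.items.max: 5"))
	fmt.Printf("%+v\n", classify("maxItems: 3"))
}
```
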
Add Block interface and the seven typed kinds the parser will
dispatch to (architecture §4.6):
  - ModelBlock      (swagger:model)
  - RouteBlock      (swagger:route)
  - OperationBlock  (swagger:operation)
  - ParametersBlock (swagger:parameters)
  - ResponseBlock   (swagger:response)
  - MetaBlock       (swagger:meta)
  - UnboundBlock    (no annotation — struct field docstrings)

Block interface: Pos, Title, Description, Diagnostics, AnnotationKind,
plus iter.Seq-based iterators Properties/YAMLBlocks/Extensions. Using
iter.Seq (Go 1.23+, module targets 1.25) per §4.2: iterator form, not
Accept/Visit callbacks.

Support types:
  - Property {Keyword, Pos, Value, Typed, ItemsDepth}
  - TypedValue {Type, Number, Integer, Boolean, String}
  - RawYAML {Pos, Text}      (captured --- body; not parsed here)
  - Extension {Name, Pos, Value}

baseBlock (unexported, pointer-embedded) holds the shared state;
typed blocks embed it and add kind-specific positional fields. Exported
methods come through the embedding — external callers see the
interface surface and the kind-specific fields, nothing else.

AnnotationKind enum with String() and AnnotationKindFromName(name).
Labels factored into const block (labelRoute, labelModel, …) so the
same literal appears once — parser (P1.4) and analyzer will use the
same names for diagnostics.

Tests cover interface satisfaction (compile-time assertions), label
round-trip, full baseBlock accessor surface via a ModelBlock, and
iterator early-break semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Wire the preprocess → lex → parse pipeline: Parse(cg, fset) returns a
typed Block from a Go ast.CommentGroup. ParseTokens(tokens) is the
same without the preprocessor, for LSP scenarios where token streams
are synthesized.

Algorithm:
  1. Scan tokens for the first ANNOTATION (or none).
  2. Build typed Block via buildTypedBlock dispatch:
       swagger:model        -> ModelBlock{Name}
       swagger:response     -> ResponseBlock{Name}
       swagger:parameters   -> ParametersBlock{TargetTypes}
       swagger:meta         -> MetaBlock
       swagger:route        -> RouteBlock{Method,Path,Tags,OpID}
       swagger:operation    -> OperationBlock{Method,Path,Tags,OpID}
       anything else        -> UnboundBlock carrying the kind
  3. parseTitleDesc on tokens before the annotation (first paragraph
     is title, rest joined as description).
  4. parseBody on tokens after the annotation: KEYWORD_VALUE and
     KEYWORD_BLOCK_HEAD -> Property; YAML_FENCE pairs -> RawYAML body
     captured via reconstructLine() best-effort.
  5. Never panic: unknown tokens are skipped; unmatched YAML fence
     emits CodeUnterminatedYAML diagnostic but still captures body.

Positional args for route/operation (P1.6 scope) are extracted here
already since the annotation token already carries them. Malformed
(<3 args) emits CodeInvalidAnnotation.

Bug fix in the preprocessor surfaced by the YAML fence tests: the
previous trimContentPrefix stripped leading `-`, which also ate the
`---` fence marker. `-` removed from the strip set. Bullet-list
dashes in description now survive to Text (arguably more correct
than v1's silent strip — flagged in the godoc).

Also renamed the internal `parser` type to `parseState` to avoid a
name clash with the go/parser package the tests import.

Tests (parser_test.go) cover:
  - ModelBlock / RouteBlock / ParametersBlock / UnboundBlock dispatch
  - route malformed -> CodeInvalidAnnotation diagnostic
  - nil CommentGroup -> empty UnboundBlock
  - title + description extraction (godoc-style ordering)
  - properties in order + item-depth preservation
  - block-head property (consumes:)
  - balanced YAML fence -> body captured
  - unterminated YAML fence -> diagnostic + body captured to EOF
  - "don't panic on anything weird" sweep

Known gap (P2.1): YAML bodies are reconstructed from already-
classified tokens, so indentation and exact punctuation are lost.
Good enough for kind/content assertions; not yet suitable for YAML
re-parsing. P2.1 will add fence-state tracking so raw bytes survive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
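
The positional-argument extraction for route and operation annotations mentioned above can be sketched as follows. It assumes the layout "method, path, optional tags, operation id" (so the last argument is the operation id and everything between the path and the id is a tag), and it returns a boolean where the real parser would emit CodeInvalidAnnotation. The `RouteArgs` type is hypothetical.

```go
package main

import "fmt"

// RouteArgs is a hypothetical holder for the positional arguments.
type RouteArgs struct {
	Method string
	Path   string
	Tags   []string
	OpID   string
}

// fillOperationArgs splits annotation arguments into method, path,
// tags and operation id. Fewer than three arguments is malformed.
func fillOperationArgs(args []string) (RouteArgs, bool) {
	if len(args) < 3 {
		return RouteArgs{}, false
	}
	last := len(args) - 1
	return RouteArgs{
		Method: args[0],
		Path:   args[1],
		Tags:   args[2:last], // empty when there are exactly three arguments
		OpID:   args[last],
	}, true
}

func main() {
	r, ok := fillOperationArgs([]string{"GET", "/pets", "pets", "users", "listPets"})
	fmt.Printf("%+v %v\n", r, ok)
}
```
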
Populate Property.Typed for the four parse-time-convertible ValueTypes
(Number, Integer, Boolean, StringEnum); leave zero for the deferred
ones (String verbatim, CommaList, RawValue, RawBlock). Conversion
failures emit non-fatal diagnostics with the appropriate CodeInvalid*
code; Typed.Type stays ValueNone so consumers can distinguish "no
conversion performed" from "zero value successfully parsed".

Per-type rules:
  - Number: strconv.ParseFloat; accepts v1's leading comparison
    operator (<, <=, >, >=, =) which is captured in TypedValue.Op
    for the analyzer to use for exclusiveMaximum / exclusiveMinimum.
  - Integer: strconv.ParseInt base 10, rejects fractions.
  - Boolean: strict "true"/"false" case-insensitive (stdlib
    ParseBool is too lenient: it also accepts "1"/"t"/"T"/"0"/"f"/"F",
    which v1 rejects).
  - StringEnum: case-insensitive match against Keyword.Value.Values,
    canonicalised to the table spelling.

Adds TypedValue.Op to ast.go for the operator prefix.

Tests (typeconv_test.go) cover valid conversion for each type, each
operator variant, case-insensitive boolean + enum, the stdlib-lenient
"1" rejection, fraction-rejection for integers, and non-primitive
value types staying at zero TypedValue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
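
The per-type rules in the commit above translate into small conversion helpers. The sketch below is an approximation: function names are made up, and errors are returned directly where the parser would emit CodeInvalid* diagnostics.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumber accepts an optional leading comparison operator and
// returns it alongside the parsed value.
func parseNumber(s string) (op string, n float64, err error) {
	s = strings.TrimSpace(s)
	// Two-character operators are tried first so "<=" is not read as "<".
	for _, candidate := range []string{"<=", ">=", "<", ">", "="} {
		if strings.HasPrefix(s, candidate) {
			op = candidate
			s = strings.TrimSpace(s[len(candidate):])
			break
		}
	}
	n, err = strconv.ParseFloat(s, 64)
	return op, n, err
}

// parseInteger parses base-10 integers; "1.5" is rejected by ParseInt.
func parseInteger(s string) (int64, error) {
	return strconv.ParseInt(strings.TrimSpace(s), 10, 64)
}

// parseStrictBool accepts only "true" and "false", in any letter case.
func parseStrictBool(s string) (bool, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "true":
		return true, nil
	case "false":
		return false, nil
	default:
		return false, fmt.Errorf("invalid boolean %q", s)
	}
}

func main() {
	op, n, err := parseNumber("<= 10.5")
	fmt.Println(op, n, err)
	_, err = parseStrictBool("1")
	fmt.Println(err)
}
```
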
The v1 rxRoutePrefix allows one leading godoc identifier before
swagger:route, e.g.:

    // ListPets swagger:route GET /pets tags listPets

matchGodocRoutePrefix scans a leading identifier (Unicode letter +
letters/digits/'_'/'-'), whitespace, then the literal "swagger:route"
terminated by whitespace or EOL. On match the lexer advances past the
prefix and feeds the rest to lexAnnotation, producing a normal
TokenAnnotation with Pos pointing at 's' of swagger:.

The exception is narrow by design:
  - Only "swagger:route" — other annotations (model, operation,
    parameters, …) keep the "must start the line" rule.
  - "swagger:routex" does NOT match (guarded).
  - Multi-word prefixes ("Do Foo swagger:route") do NOT match.

Positional Method/Path/Tags/OpID extraction was already in place
from P1.4 (fillOperationArgs). This commit just feeds the godoc-
prefixed form into the same path.

Tests cover: lexer-level classification (route-only exception,
'swagger:routex' rejection, multi-word-prefix rejection, Pos
advance past prefix), and end-to-end parser production of a proper
RouteBlock with zero diagnostics.

Also adds a nolint comment on Kind.String() for "route"/"operation"/
"meta"/"response" labels — goconst wants to share labelXxx from
ast.go but Kind (keyword context) and AnnotationKind (Block
dispatch) are intentionally separate concerns (architecture §4.6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
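
The scan performed by matchGodocRoutePrefix is described above in enough detail to sketch it. The version below is a reconstruction from that description, not the actual implementation: it returns the byte offset of "swagger:route" when the line starts with exactly one identifier followed by whitespace. Lines that start directly with "swagger:route" return false here because they take the regular annotation path.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

const routeMarker = "swagger:route"

func matchGodocRoutePrefix(line string) (int, bool) {
	// The identifier must start with a Unicode letter.
	first, size := utf8.DecodeRuneInString(line)
	if size == 0 || !unicode.IsLetter(first) {
		return 0, false
	}
	// Continue over letters, digits, '_' and '-'.
	i := size
	for i < len(line) {
		r, sz := utf8.DecodeRuneInString(line[i:])
		if !(unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '-') {
			break
		}
		i += sz
	}
	// At least one space or tab must separate identifier and marker.
	j := i
	for j < len(line) && (line[j] == ' ' || line[j] == '\t') {
		j++
	}
	if j == i || !strings.HasPrefix(line[j:], routeMarker) {
		return 0, false
	}
	// The marker must end at whitespace or end of line ("swagger:routex" is rejected).
	tail := line[j+len(routeMarker):]
	if tail != "" && tail[0] != ' ' && tail[0] != '\t' {
		return 0, false
	}
	return j, true
}

func main() {
	fmt.Println(matchGodocRoutePrefix("ListPets swagger:route GET /pets tags listPets"))
	fmt.Println(matchGodocRoutePrefix("Do Foo swagger:route GET /pets"))
}
```
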
Add checkContextValidity(base) to the parser's post-body pass. For
each Property, check whether the Keyword's Contexts list intersects
the allowed set for the block's AnnotationKind; if not, emit
CodeContextInvalid as SeverityWarning (non-fatal).

Mapping (AnnotationKind -> allowed Kind union):
  - AnnModel       -> {Schema, Items}
  - AnnParameters  -> {Param, Schema, Items}
  - AnnResponse    -> {Response, Schema, Header, Items}
  - AnnOperation   -> {Operation, Param, Schema, Header, Items, Response}
  - AnnRoute       -> {Route, Param, Schema, Header, Items, Response}
  - AnnMeta        -> {Meta, Schema}
  - everything else -> nil (skip check)

The sets are deliberately permissive: an operation body can host
schema properties, response headers, parameters, etc. Analyzers with
more context (Go type, enclosing struct) can enforce tighter rules.
UnboundBlock skips the check — its target context is determined by
the scanner from the enclosing declaration.

Diagnostic format:
  keyword "in" not valid under swagger:model (legal in: param)

Tests (context_test.go) cover:
  - legal keyword -> no diagnostic
  - illegal keyword -> exactly one warning, message mentions the
    keyword and its legal contexts
  - multiple illegal keywords -> one diagnostic each
  - keywords legal under multiple annotations (consumes: under Meta
    / Route / Operation) -> zero diagnostics
  - UnboundBlock skips the check
  - severity is Warning, never Error; Block is still produced

Cleanups shaken out by the new checks:
  - contextsOverlap uses slices.Contains
  - Kind.String() drops its now-unused nolint (goconst no longer
    triggers)
  - lexer_test.go references labelRoute instead of the literal "route"
  - context_test.go helper hardcodes CodeContextInvalid (unparam)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
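
The context check above amounts to a set-intersection test. The sketch below covers three rows of the mapping and reproduces the diagnostic message format quoted in the commit; representing Kind and AnnotationKind as strings and returning the message as a plain string are simplifications for the example.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

type Kind string

const (
	KindParam  Kind = "param"
	KindSchema Kind = "schema"
	KindItems  Kind = "items"
	KindMeta   Kind = "meta"
)

type AnnotationKind string

// allowed lists three rows of the mapping; annotations that are
// absent from the map skip the check.
var allowed = map[AnnotationKind][]Kind{
	"model":      {KindSchema, KindItems},
	"parameters": {KindParam, KindSchema, KindItems},
	"meta":       {KindMeta, KindSchema},
}

func contextsOverlap(a, b []Kind) bool {
	for _, k := range a {
		if slices.Contains(b, k) {
			return true
		}
	}
	return false
}

// checkContext returns a warning message, or "" when the keyword is
// legal under the annotation or the annotation is not checked.
func checkContext(ann AnnotationKind, keyword string, legal []Kind) string {
	set, checked := allowed[ann]
	if !checked || contextsOverlap(legal, set) {
		return ""
	}
	labels := make([]string, len(legal))
	for i, k := range legal {
		labels[i] = string(k)
	}
	return fmt.Sprintf("keyword %q not valid under swagger:%s (legal in: %s)",
		keyword, ann, strings.Join(labels, ", "))
}

func main() {
	fmt.Println(checkContext("model", "in", []Kind{KindParam}))
}
```
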
Add productions_test.go with one focused test per §2.1 envelope
production:
  - annotation-only         (annotation w/o body)
  - title-only              (single paragraph, no description)
  - multi-paragraph desc    (verifies \n\n join)
  - properties-no-title     (annotation followed by keywords)
  - properties-interleaved  (TEXT between properties is dropped)
  - block-head property     (consumes: value-less)
  - empty YAML body         (--- immediately followed by ---)
  - multiple YAML blocks    (two independent fenced sections)
  - full-envelope order     (title → desc → props → yaml composed)

Plus coverage-fill:
  - Exhaustive String() on Kind, ValueType, TokenKind (previously
    only spot-tested).
  - Remaining AnnotationKind dispatch (ResponseBlock, MetaBlock,
    UnboundBlock from strfmt / alias / allOf / enum / ignore / file).

Coverage on internal/parsers/grammar/ rises from 85.1% to 93.3% —
above the ≥90% exit criterion for P1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Two catch-up items from the P1 flag queue.

1. Verbatim YAML body contract.
   - Add Line.Raw field — content with Go comment markers stripped
     and one layer of godoc decoration removed, but all YAML
     indentation preserved.
   - Per comment kind:
       `// foo`              -> Raw = "foo" (one godoc space stripped)
       `//   200: ok`        -> Raw = "  200: ok" (2-space indent kept)
       `/* * bar */` (cont.) -> Raw = "bar" (block continuation stripped)
       `/* ... */` no-cont.  -> Raw = full line (indentation kept)
   - Preprocessor threads a per-comment-kind rawStrip strategy into
     stripLine (stripSingleGodocSpace vs stripBlockContinuation).
   - Lexer now tracks one bit of state (inFence) and emits a new
     TokenRawLine kind carrying Line.Raw verbatim for every line
     between matched `---` fences.
   - Parser's collectYAMLBody simplifies to `post[i].Text`; the
     best-effort reconstructLine() from P1.4 is deleted outright.
   - `TestParseYAMLFenceBalanced` updated (via new production tests)
     to assert indentation is preserved; body lines come through
     ready for internal/parsers/yaml/ to consume.

2. Preprocessor `-` stripping — v2 divergence lock-in.
   - Decision: keep leading `-` in Line.Text (don't restore v1's
     silent strip). This preserves bullet-list semantics ("- foo"
     stays "- foo", not "foo") and keeps the `---` YAML fence
     marker detectable without special-casing.
   - trimContentPrefix already dropped `-` from the strip set in
     P1.4; this commit adds explicit tests that lock in the
     behavior (TestP110DashNotStrippedInProse + preprocessor-level
     TestP110DashPreservedOnlySurvivesVerbatimInText).
   - The parity harness in P4 will verify no real fixture depended
     on the old behavior; if one does, it's an explicit migration
     decision then, not a silent deferral.

Coverage on internal/parsers/grammar/ lands at 94.5%. P1.10 queue is
empty — no pending flags roll over into P4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
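
To make the verbatim-body contract above more concrete, here is a sketch of the `//` raw-strip rule and of the one-bit fence state. Function names other than stripSingleGodocSpace are made up, only line comments are handled, and the fence tracker works on plain strings rather than tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// stripSingleGodocSpace removes the "//" marker and at most one
// following space, so YAML indentation beyond that space survives.
func stripSingleGodocSpace(comment string) string {
	return strings.TrimPrefix(strings.TrimPrefix(comment, "//"), " ")
}

// rawLines applies the strip to every line of a // comment group.
func rawLines(comments []string) []string {
	out := make([]string, len(comments))
	for i, c := range comments {
		out[i] = stripSingleGodocSpace(c)
	}
	return out
}

// fenceBodies returns the raw lines found between "---" fences.
// An unterminated fence still yields its body, captured to the end.
func fenceBodies(raw []string) (bodies [][]string, unterminated bool) {
	inFence := false
	var cur []string
	for _, line := range raw {
		switch {
		case strings.TrimSpace(line) == "---":
			if inFence {
				bodies = append(bodies, cur)
				cur = nil
			}
			inFence = !inFence
		case inFence:
			cur = append(cur, line)
		}
	}
	if inFence {
		bodies = append(bodies, cur)
	}
	return bodies, inFence
}

func main() {
	raw := rawLines([]string{"// responses:", "// ---", "//   200: ok", "// ---"})
	bodies, unterminated := fenceBodies(raw)
	fmt.Printf("%q %v\n", bodies, unterminated)
}
```
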
Extend Property with `Body []string`, populated for KEYWORD_BLOCK_HEAD
tokens (consumes:, produces:, security:, responses:, parameters:,
extensions:, infoExtensions:, tos:, securityDefinitions:,
externalDocs:). Non-block-head properties keep Body nil.

collectBlockBody(base, post, i): after emitting the block-head
property, consume subsequent TEXT tokens into prop.Body. Collection
stops at the next structured token (KEYWORD_*, ANNOTATION,
YAML_FENCE, RAW_LINE, EOF) per legacy stop point S6 from the
implied-stop appendix. Interior blank tokens are deferred and
re-emitted as empty body lines only if more text follows — trailing
blanks are dropped.

Block.Properties() iteration order is unchanged; each block-head
Property now carries its body inline alongside the keyword metadata.
Analyzers (P5 bridge taggers) read prop.Body directly; per-keyword
tokenization (MIME types for consumes/produces, security mappings,
etc.) is their concern.

Tests (blockbody_test.go): single-block capture, stop-at-next-keyword,
stop-at-annotation, stop-at-YAML-fence, trailing-blanks-trimmed,
non-block-keyword-unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
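
The deferred-blank rule in collectBlockBody is easy to get wrong, so a sketch may help. The token model below is reduced to three relevant kinds, and the function returns the body plus the index of the first unconsumed token instead of mutating a Property; both are simplifications. Dropping blanks that precede the first body line is an assumption of this sketch.

```go
package main

import "fmt"

type TokenKind int

const (
	TokenText TokenKind = iota
	TokenBlank
	TokenKeyword // stands for any structured token that ends the body
)

type Token struct {
	Kind TokenKind
	Text string
}

// collectBlockBody consumes TEXT tokens starting at post[i]. Blank
// tokens are held back and only re-emitted as empty lines when more
// text follows, so trailing blanks never reach the body.
func collectBlockBody(post []Token, i int) ([]string, int) {
	var body []string
	pending := 0
	for ; i < len(post); i++ {
		switch post[i].Kind {
		case TokenBlank:
			pending++
		case TokenText:
			if len(body) > 0 {
				for ; pending > 0; pending-- {
					body = append(body, "")
				}
			}
			pending = 0
			body = append(body, post[i].Text)
		default:
			return body, i
		}
	}
	return body, i
}

func main() {
	post := []Token{
		{TokenText, "application/json"},
		{TokenBlank, ""},
		{TokenText, "application/xml"},
		{TokenBlank, ""},
		{TokenKeyword, "produces"},
	}
	body, next := collectBlockBody(post, 0)
	fmt.Printf("%q next=%d\n", body, next)
}
```
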
When collectBlockBody encounters an `extensions:` or `infoExtensions:`
block head, each body TEXT token is also parsed into an Extension
{Name, Value, Pos} and appended to the Block's extensions slice.
Property.Body is still populated with the raw lines for analyzers
that prefer verbatim input.

parseExtensionLine splits "name: value" per-line; whitespace is
trimmed, blank-or-malformed lines are skipped. Name validation (the
`x-*` requirement) is deferred to P2.4.

Block.Extensions() returns the flat iterator; callers use it
uniformly regardless of whether the source used `extensions:` or
`infoExtensions:` (or, later, extension blocks that appear inline
inside other contexts).

Tests (extensions_test.go): basic extraction, infoExtensions path,
per-line source positions, parallel Body+Extensions survival,
scoping (consumes: body not scraped), malformed-line skipping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add isExtensionName(s) — a well-formedness check mirroring the v1
rxAllowedExtensions pattern `^[Xx]-` (case-tolerant prefix, at least
one suffix character required). When collectBlockBody is in an
extensions block and extracts a name that fails the check, emit a
SeverityWarning CodeInvalidExtension diagnostic at the offending
line's Pos.

Non-fatal by design:
  - The Extension is still appended to Block.Extensions() so
    analyzers / LSP can decide policy (surface to user, drop, etc.).
  - Pairs with the broader "diagnostics accumulate, don't throw"
    principle (architecture §4.3, tasks P1.4).

Tests: `TestExtensionsInvalidNameDiagnostic` (invalid name emits
warning; extension still collected) and
`TestExtensionsAcceptsUppercaseX` (X- prefix accepted per v1
behavior).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
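
The two extension helpers from the preceding commits are small enough to sketch together. The bodies below are reconstructions from the commit descriptions (split on the first colon, trim whitespace, require a case-tolerant `x-` prefix plus at least one more character); the exact signatures are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// parseExtensionLine splits "name: value" on the first colon and trims
// both sides. Lines without a colon or without a name are skipped.
func parseExtensionLine(line string) (string, string, bool) {
	name, value, found := strings.Cut(line, ":")
	name = strings.TrimSpace(name)
	if !found || name == "" {
		return "", "", false
	}
	return name, strings.TrimSpace(value), true
}

// isExtensionName reports whether s starts with "x-" or "X-" and has
// at least one character after the prefix.
func isExtensionName(s string) bool {
	return len(s) > 2 && (s[0] == 'x' || s[0] == 'X') && s[1] == '-'
}

func main() {
	for _, line := range []string{"  x-internal-id: 42 ", "vendor: acme", "no colon here"} {
		name, value, ok := parseExtensionLine(line)
		if !ok {
			fmt.Printf("skipped %q\n", line)
			continue
		}
		// An invalid name only warrants a warning; the pair is still kept.
		fmt.Printf("%s=%s valid=%v\n", name, value, isExtensionName(name))
	}
}
```
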
Create internal/parsers/yaml/ — a thin wrapper around
go.yaml.in/yaml/v3 for parsing the RawYAML bodies isolated by
internal/parsers/grammar/. The grammar parser stays YAML-free
(verified: `go list -f '{{.Imports}}' ./internal/parsers/grammar/`
has no yaml entry) per architecture §3.3, §5.1.

Exposed surface:
  - yaml.Parse(body) -> (any, error):
      Unmarshal into a generic value (map/slice/scalar). Returns
      (nil, nil) for an empty body so callers don't branch on
      error-vs-nil for the "fence with no content" case.
  - yaml.ParseInto(body, dst) -> error:
      Unmarshal into a caller-defined struct. Empty body is a
      no-op, leaving dst at its zero value.

Both wrap the underlying error with a "yaml:" prefix so downstream
diagnostics can distinguish YAML parsing failures from other errors
without type-asserting.

Tests: empty body, flat map, nested structure with non-string keys
(map[any]any as YAML v3 returns), invalid YAML error wrapping,
struct unmarshal, empty+struct no-op.

Pattern: this subpackage establishes the seam for any future
sub-language (enum variants per W2, richer example syntax per W3,
private-comment bodies per W4). Each gets its own
internal/parsers/<name>/ package; the grammar parser never
imports any of them.

Completes P2. P1 and P2 are both fully green; ready for P3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Expose the parser as a three-method interface — Parse, ParseText,
ParseAs — behind NewParser(fset). The interface is the injection
seam the P5 bridge-taggers and property-based builder tests need
(architecture §5.3): tests construct Block values directly and
feed them through a mock Parser without re-lexing text.

  - Parse(cg)       primary path — preprocess -> lex -> parse
  - ParseText(t,p)  LSP / test path — raw text with position
  - ParseAs(k,t,p)  LSP kind-hint path (§4.6) — prepends a
                    synthetic swagger:<kind> so dispatch goes the
                    right way even when the editor hasn't typed
                    the annotation line yet.

preprocessText is a small helper that turns raw lines into Line
values carrying both Text and Raw (identical — no Go comment
markers to strip) with monotonic positions from basePos.

Backward compat: the existing package-level Parse(cg, fset) is now
a convenience wrapper around NewParser(fset).Parse(cg). All
existing tests pass unmodified.

Tests: interface satisfaction (compile-time), each method exercised
end-to-end, ParseAs forces dispatch on a property-only body, the
package-level Parse wrapper still works.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add the Option functional-options layer on NewParser:

    type Option func(*parserImpl)
    func NewParser(fset, opts ...Option) Parser

Ship WithDiagnosticSink(cb) as the first concrete Option — invokes
the callback for every Diagnostic the parser emits, in parallel with
the block-local accumulation that already exists. The LSP seam
(architecture §4.3): diagnostics must be surfaced as they're
produced, not batched until parse completes.

Under the hood, parseState gains a `sink func(Diagnostic)` field and
an `emit(d)` helper that fans out to sink + local slice. All nine
`p.diag = append(p.diag, …)` sites converted to `p.emit(…)`. The
package-level `ParseTokens` path keeps a nil sink (no callback) —
only parserImpl constructs a parseState with the Option's sink.

Logger option intentionally not added yet — parsers with no trace
output haven't needed one; when an LSP or CLI developer wants
verbose output we'll add it then. NewParser takes its options
variadically, so the addition won't break callers.

funcorder lint honored: ParseAs moved above runParser so exported
methods cluster before unexported ones in parserImpl.

Tests: sink receives every diagnostic (matches Block.Diagnostics()
count), default behavior (no option) unchanged, codes in stream
match expected (CodeContextInvalid + CodeInvalidNumber from a
deliberately wonky source).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
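
The sink option and the emit fan-out described above can be sketched as follows. The example leaves out the fset argument and reduces Diagnostic to two string fields; `newState` is a hypothetical helper standing in for however parserImpl constructs its parseState.

```go
package main

import "fmt"

type Diagnostic struct {
	Code    string
	Message string
}

type parserImpl struct {
	sink func(Diagnostic)
}

// Option configures a parserImpl at construction time.
type Option func(*parserImpl)

// WithDiagnosticSink registers a callback invoked for every diagnostic
// as soon as it is emitted.
func WithDiagnosticSink(cb func(Diagnostic)) Option {
	return func(p *parserImpl) { p.sink = cb }
}

func NewParser(opts ...Option) *parserImpl {
	p := &parserImpl{}
	for _, o := range opts {
		o(p)
	}
	return p
}

type parseState struct {
	diag []Diagnostic
	sink func(Diagnostic)
}

// newState hands the configured sink (possibly nil) to a fresh state.
func (p *parserImpl) newState() *parseState {
	return &parseState{sink: p.sink}
}

// emit records the diagnostic locally and forwards it to the sink.
func (s *parseState) emit(d Diagnostic) {
	s.diag = append(s.diag, d)
	if s.sink != nil {
		s.sink(d)
	}
}

func main() {
	p := NewParser(WithDiagnosticSink(func(d Diagnostic) {
		fmt.Println("streamed:", d.Code)
	}))
	st := p.newState()
	st.emit(Diagnostic{Code: "parse.context-invalid"})
	st.emit(Diagnostic{Code: "parse.invalid-number"})
	fmt.Println("accumulated:", len(st.diag))
}
```
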
Add six lookup methods to Block (and baseBlock):

    Has(name) bool
    GetFloat(name) (float64, bool)
    GetInt(name) (int64, bool)
    GetBool(name) (bool, bool)
    GetString(name) (string, bool)
    GetList(name) ([]string, bool)

All share a findProperty helper that matches canonical-name OR alias,
case-insensitively. `Max`, `max`, `MAXIMUM` all resolve to the same
`maximum` Property.

Typed getters return (zero, false) when the keyword is absent or its
ValueType doesn't match — so callers write:

    if n, ok := block.GetFloat("maximum"); ok {
        schema.Maximum = n
    }

GetString is the permissive fallback: StringEnum returns the
canonical (table-spelled) value; everything else returns the raw
Property.Value. GetList unifies the two shapes of "a list of things":
a block-head Property's Body (consumes:, security:, …) is returned
directly; a ValueCommaList value (enum, schemes, …) is split on
commas with whitespace trimmed. GetList returns a defensive copy so
mutating the returned slice can't corrupt Block state.

Block interface grows from 8 to 14 methods. interfacebloat lint
silenced with a rationale: Block is the single consumer contract
for both builders and LSP; splitting into BlockInfo / BlockIterators
/ BlockAccessors would introduce friction at every call site and
gain nothing — there's no implementation other than baseBlock-
embedded typed kinds.

Coverage on internal/parsers/grammar/ holds at 93.8%. Full repo
test suite green.

Tests (accessors_test.go): Has + absent, alias + case-insensitive
lookup, GetFloat/Int/Bool/String happy paths, StringEnum
canonicalization, GetList for CommaList and Body shapes, defensive-
copy behavior, type-mismatch returning false.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
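
The two shapes handled by GetList, and the defensive copy, can be sketched as below. The Property struct is reduced to what the example needs, aliases live on the property rather than in a keyword table, and the ValueType check that the real getter performs is omitted; all three are simplifications.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

type Property struct {
	Name    string
	Aliases []string
	Value   string   // raw value of a single-line keyword
	Body    []string // body lines of a block-head keyword
}

type block struct {
	props []Property
}

// findProperty matches the canonical name or an alias, ignoring case.
func (b *block) findProperty(name string) (Property, bool) {
	for _, p := range b.props {
		if strings.EqualFold(p.Name, name) {
			return p, true
		}
		for _, a := range p.Aliases {
			if strings.EqualFold(a, name) {
				return p, true
			}
		}
	}
	return Property{}, false
}

// GetList returns a block-head body as a copy, or splits a
// comma-separated value and trims each element.
func (b *block) GetList(name string) ([]string, bool) {
	p, ok := b.findProperty(name)
	if !ok {
		return nil, false
	}
	if p.Body != nil {
		return slices.Clone(p.Body), true
	}
	parts := strings.Split(p.Value, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts, true
}

func main() {
	b := &block{props: []Property{
		{Name: "consumes", Body: []string{"application/json", "application/xml"}},
		{Name: "schemes", Value: "http, https"},
	}}
	fmt.Println(b.GetList("CONSUMES"))
	fmt.Println(b.GetList("schemes"))
}
```
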
Per Fred's review: the Parser interface exists to enable mock
implementations in tests, not to support multiple production
variants. NewParser should return the concrete type so IDE
discoverability and docs-on-hover work; the interface stays as the
mock contract.

Changes:
  - Rename parserImpl -> DefaultParser (exported).
  - NewParser(fset, opts...) now returns *DefaultParser instead of
    Parser. Callers who previously wrote `var p Parser =
    NewParser(...)` continue to work via implicit satisfaction.
  - Drop the ireturn nolint on NewParser (no longer applies).
  - Parser interface godoc reframed: "consumer contract / mock seam".
  - Compile-time assertion in parser_api_test.go updated to
    `(*DefaultParser)(nil)`.

Per-method ireturn nolints on Parse/ParseText/ParseAs stay — those
still return Block (the AST's polymorphic family).
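
A minimal sketch of the "concrete return, interface as mock seam" pattern described above. Signatures are simplified for illustration: the real NewParser takes a file set and options, and `textBlock`, `mockParser` and `kindOf` are hypothetical names.

```go
package main

import "fmt"

// Block is a placeholder for the AST's polymorphic block family.
type Block interface{ Kind() string }

type textBlock struct{ text string }

func (textBlock) Kind() string { return "text" }

// Parser is the consumer contract; tests can supply their own implementation.
type Parser interface {
	ParseText(text string) Block
}

// DefaultParser is the single production implementation.
type DefaultParser struct{}

// NewParser returns the concrete type, so callers see its full method set.
func NewParser() *DefaultParser { return &DefaultParser{} }

// ParseText still returns the Block interface.
func (*DefaultParser) ParseText(text string) Block { return textBlock{text: text} }

// Compile-time assertion that the concrete type satisfies the interface.
var _ Parser = (*DefaultParser)(nil)

// mockParser shows the interface used as a test seam.
type mockParser struct{ calls int }

func (m *mockParser) ParseText(string) Block { m.calls++; return textBlock{} }

// kindOf depends only on the interface, so it accepts either implementation.
func kindOf(p Parser, text string) string { return p.ParseText(text).Kind() }

func main() {
	var p Parser = NewParser() // implicit satisfaction still works
	fmt.Println(kindOf(p, "hello"))
}
```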

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…mentView

Create internal/parsers/grammar/grammar_test/ (package grammartest)
with the parity-harness plumbing the P5 bridge-tagger migration
depends on:

  - NormalizedCommentView — parser-agnostic, diff-friendly shape
    capturing AnnotationKind + kind-specific positional args (Name,
    Method/Path/Tags/OpID, TargetTypes), Title/Description, typed
    Properties (value + Typed subtree), YAML bodies, Extensions, and
    Diagnostics sorted by (code, severity) for determinism.

  - ViewFromBlock(b grammar.Block) NormalizedCommentView — the v2
    adapter. Uses a type-switch over the block family that is
    explicit per kind (future AnnotationKind additions fail closed —
    they just don't populate args).

  - ParseSourceToViews(t, src) — test helper: parse a Go snippet,
    walk its declarations, normalize each attached comment group.

  - AssertGoldenView(t, path, views) — JSON-snapshot diff driver
    honoring UPDATE_GOLDEN=1 (matches the existing scantest convention).

Seven committed golden fixtures cover the common comment-group shapes:
simple_model, route_with_tags, operation_with_yaml,
parameters_with_validations, meta_with_extensions, unbound_with_bullet,
and context_invalid_diag. These lock in the v2 parser's output; P5
commits will extend the set per builder.

Bug caught by the harness and fixed inline: UnboundBlock previously
dropped all prose because the parser routed the entire token stream
into parseBody. Added findBodyStart(tokens) — for an annotation-less
comment group, tokens are split at the first keyword/YAML-fence so
the prose prelude (e.g. a struct-field docstring) is recovered into
Title/Description. This is the canonical use case for UnboundBlock
and would have been a recurring bug in P5 without the harness catch.
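
The split performed by findBodyStart might look roughly like this. The token kinds and the `token` struct are assumptions made for the sketch; only the function name and the "first keyword or YAML fence" rule come from the commit message.

```go
package main

import "fmt"

type tokenKind int

// Hypothetical token kinds; the real lexer has its own set.
const (
	tokProse tokenKind = iota
	tokKeyword
	tokYAMLFence
)

type token struct {
	Kind tokenKind
	Text string
}

// findBodyStart returns the index of the first keyword or YAML-fence token,
// or len(tokens) when the whole stream is prose.
func findBodyStart(tokens []token) int {
	for i, t := range tokens {
		if t.Kind == tokKeyword || t.Kind == tokYAMLFence {
			return i
		}
	}
	return len(tokens)
}

func main() {
	tokens := []token{
		{tokProse, "Name of the pet."},
		{tokKeyword, "maximum: 10"},
	}
	i := findBodyStart(tokens)
	prelude, body := tokens[:i], tokens[i:]
	fmt.Println(len(prelude), len(body)) // 1 1
}
```

The prelude slice would feed Title/Description recovery and the remainder would go to body parsing.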

P4.3 (Options.UseGrammarParser feature flag + scanner wiring) is
deferred to the first P5 bridge-tagger commit, where it has an
actual consumer. Before P5 the flag would have no code path to switch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add fixtures/enhancements/enum-overrides/ and TestCoverage_EnumOverrides
to pin the v1 behavior for five enum cases that W2 needs to answer
before P5.1:

  A. swagger:enum + matching consts              -> const inference
  B. inline comma-list only, no consts           -> inline
  C. inline JSON-array only, no consts           -> inline
  D. swagger:enum with NO matching consts        -> empty schema
  E. swagger:enum + matching consts + inline     -> inline WINS

The golden snapshot confirms Fred's proposed override semantics hold
in v1 today: case E renders enum=["urgent","normal"] from the inline
annotation even though PriorityE has three const values ("low",
"medium", "high"). So the v2 parser migration inherits this rule
rather than diverging.

Things the golden also surfaces (non-parity items, captured here so
they aren't re-discovered during P5):

  - comma-list splitter preserves leading whitespace: B renders
    ["low"," medium"," high"] with literal spaces. A v1 quirk; P5
    bridge-tagger should strip per-value whitespace (or the new
    internal/parsers/enum/ sub-parser should).
  - case D emits a property with NO type and NO enum. The
    swagger:enum annotation is silently ignored when no consts
    match. P5 should surface a diagnostic ("swagger:enum
    TypeName resolved to zero values").
  - case E retains x-go-enum-desc describing the const values even
    though the inline override wins. Stale vendor-extension; P5
    should drop x-go-enum-desc when the inline override takes
    precedence.

This golden is the factual v1 reference. Any v2 divergence during
P5.1 will be explicit (new golden, documented rationale).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Windows CI fails on the new grammar_test AssertGoldenView harness
because git's default core.autocrlf=true converts line endings on
checkout. The existing internal/scantest harness sidesteps this by
comparing JSON semantically (assert.JSONEqT), but AssertGoldenView
does a byte-equal compare — exact by design so it catches
field-order regressions and trailing-newline changes. Byte-equal +
platform-dependent checkout = broken on Windows.

Two narrow fixes:

  - Normalise CRLF -> LF on the read side only (bytes.ReplaceAll).
    `got` is always freshly produced by json.MarshalIndent + a
    trailing '\n', so it's LF-only; normalising `want` after
    ReadFile makes the compare platform-independent without
    loosening the "exact bytes expected" invariant.

  - Add .gitattributes pinning *.json and both golden directories to
    `eol=lf`. Belt-and-suspenders: prevents autocrlf from
    corrupting goldens for any other tool that does byte-level
    inspection (e.g., external diffs, editor tooling).

Both fixes together mean a fresh Windows checkout produces LF-only
golden files AND the harness tolerates CRLF-infected older
checkouts until they're refreshed.
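
The read-side normalisation can be sketched in a few lines. `normalizeEOL` and `goldenEqual` are hypothetical helper names; the only call taken from the commit message is bytes.ReplaceAll.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeEOL rewrites CRLF to LF so a golden checked out with autocrlf
// compares equal to freshly generated LF-only output.
func normalizeEOL(b []byte) []byte {
	return bytes.ReplaceAll(b, []byte("\r\n"), []byte("\n"))
}

// goldenEqual stays byte-exact but normalises only the stored (want) side.
func goldenEqual(got, want []byte) bool {
	return bytes.Equal(got, normalizeEOL(want))
}

func main() {
	got := []byte("{\n  \"a\": 1\n}\n")
	want := []byte("{\r\n  \"a\": 1\r\n}\r\n")
	fmt.Println(goldenEqual(got, want)) // true
}
```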

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add the feature flag that lets the legacy regex-based tagger
pipeline and the new grammar-parser + bridge-tagger pipeline coexist
during the P5 migration. Default false: no behavior change yet;
bridge-tagger consumers land in subsequent P5.x commits.

The flag is plumbing only at this step. Its roles unlock across
the upcoming P5 work:

  - Routing seam in the scanner/builders (step 4): when true, the
    comment-group dispatch goes through grammar.Parser; when false,
    the legacy taggers run.
  - Dual-path parity harness (step 3b): runs codescan.Run twice per
    fixture, once per flag value, diffs the results via the
    NormalizedCommentView v1/v2 adapters. This is how every P5.x
    migration verifies parity.
  - P6 cutover: flag removed, grammar parser becomes the only path.

codescan.Options is a type alias for scanner.Options, so
codescan.Run callers see the field immediately — no public-API
shim needed.

Planning: .claude/plans/p5-builder-migrations.md §5.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add internal/integration/parity_test.go: runs every fixture twice,
once per UseGrammarParser value, and asserts the resulting
*spec.Swagger values are JSON-equal. The twenty-one fixtures cover
enhancements/* and goparsing/petstore + goparsing/bookings.

Design rationale — spec-level compare, not view-level:
  - Measures the user-observable contract (the spec). v1 and v2
    producing identical specs through different internal paths is
    by definition not a user-observable difference.
  - Reuses the existing fixture corpus (21 TestCoverage_* fixtures
    become 21 parity cases) with zero reconstruction.
  - No lossy reverse-engineering from post-build spec to
    per-comment-group views — the alternative v1-adapter approach
    would have had to map Schema.Properties["foo"] back to its
    source *ast.CommentGroup, which is intrinsically fragile
    (multi-comment sites, synthesised fields).
  - Failure messages surface the exact diverging JSON path, which
    is what you need to debug.

See .claude/plans/p5-builder-migrations.md §5 for the full
discussion (rejecting the view-level adapter).

With the flag currently a no-op, the suite passes trivially.
Validates the harness plumbing (parallel t.Run, Options cloning,
WorkDir injection, error handling) before bridge-taggers land in
step 6. When the flag starts flipping pipelines, TestParity
becomes the per-commit safety net.
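
As a rough illustration of a spec-level compare, the sketch below runs a scan function once per flag value and checks that the two results are JSON-equal. `checkParity`, `jsonEqual` and the `scan` callback are hypothetical; the actual test calls codescan.Run and uses the repository's own assertion helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// jsonEqual reports whether a and b serialize to semantically equal JSON
// (key order and whitespace are ignored).
func jsonEqual(a, b any) (bool, error) {
	var va, vb any
	for _, pair := range []struct {
		in  any
		out *any
	}{{a, &va}, {b, &vb}} {
		raw, err := json.Marshal(pair.in)
		if err != nil {
			return false, err
		}
		if err := json.Unmarshal(raw, pair.out); err != nil {
			return false, err
		}
	}
	return reflect.DeepEqual(va, vb), nil
}

// checkParity runs the hypothetical scan function once per flag value.
func checkParity(scan func(useGrammarParser bool) (any, error)) (bool, error) {
	legacy, err := scan(false)
	if err != nil {
		return false, err
	}
	grammar, err := scan(true)
	if err != nil {
		return false, err
	}
	return jsonEqual(legacy, grammar)
}

func main() {
	scan := func(useGrammarParser bool) (any, error) {
		// Stand-in for a real scan: both paths yield the same document.
		return map[string]any{"swagger": "2.0"}, nil
	}
	ok, err := checkParity(scan)
	fmt.Println(ok, err) // true <nil>
}
```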

Excluded fixtures:
  - UnknownAnnotation — intentionally an error-expected path, no
    spec to compare.
  - malformed/* — same reason.
These error paths get their own non-parity coverage in the
existing TestCoverage_* suite.

Tactical test — **P6 cutover deletes this file in the same commit
that removes Options.UseGrammarParser** (grammar-parser-tasks.md
P6.4 records the obligation; p5-builder-migrations.md §5.3 the
rationale). With no flag, the test has no dual paths to diff and
would become pure CI burden.

Also cross-referenced from forthcoming-features.md §5.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Per W2 §3 decision, introduce the enum sub-parser that the schema /
parameters / responses bridge-taggers will call at step 8+ of the
P5 migration. Sibling of internal/parsers/yaml/ — imported only by
the analyzer layer; the main grammar parser stays oblivious to
enum value shape (verified: grammar package does not import enum).

API — deliberately minimal:

    func Parse(raw string) (values []any, fallbackErr error)

Shape detection:
  - Leading `[` (after TrimSpace) -> JSON-array path via
    encoding/json. Non-scalar values (objects, arrays, null) pass
    through natively, supporting OAI's any-JSON-type enum semantics
    (W2 §2.4 / forthcoming §1.3 audit).
  - Otherwise -> comma-list, each value TrimSpace'd. This **fixes
    v1's case-B whitespace quirk** (W2 §2.6): where v1 produced
    `["red", " green", " blue"]`, v2 produces `["red","green","blue"]`.

Narrow JSON detection by design: only a leading `[` triggers the
JSON path. Inputs like `{"k":"v"}`, bare `null`, `42` go through
the comma-list path (matching v1 parity: users who want structured
values wrap them in `[...]`). Documented and locked in via
TestParseDetectionIsNarrow.

Fallback behavior: when input LOOKS like JSON (leading `[`) but
fails to parse, Parse falls back to comma-list AND returns a
non-nil fallbackErr wrapped with an "enum:" prefix. Bridge-taggers
surface this as a SeverityWarning diagnostic rather than aborting,
matching v1's forgiving semantics.

Empty/whitespace-only input -> (nil, nil). A fenced but empty enum
is a no-op.
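
The detection and fallback rules above can be condensed into a short sketch. This is not the committed implementation: `parseEnum` and `splitCommaList` are illustrative names, the error text after the "enum:" prefix is invented, and ErrEmptyOrNullArray handling is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseEnum sketches the described contract: a leading '[' selects the JSON
// path, anything else is a comma-list, and malformed JSON falls back to the
// comma-list while reporting a non-nil error.
func parseEnum(raw string) ([]any, error) {
	trimmed := strings.TrimSpace(raw)
	if trimmed == "" {
		return nil, nil
	}
	if strings.HasPrefix(trimmed, "[") {
		var values []any
		err := json.Unmarshal([]byte(trimmed), &values)
		if err == nil {
			return values, nil
		}
		return splitCommaList(trimmed), fmt.Errorf("enum: falling back to comma-list: %w", err)
	}
	return splitCommaList(trimmed), nil
}

// splitCommaList trims every entry and drops empty ones.
func splitCommaList(s string) []any {
	var out []any
	for _, part := range strings.Split(s, ",") {
		if v := strings.TrimSpace(part); v != "" {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(parseEnum("red, green, blue")) // [red green blue] <nil>
}
```

A caller in the position of a bridge-tagger would keep the returned values and turn a non-nil error into a warning rather than aborting.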

Tests (enum_test.go) exercise:
  - empty/whitespace handling
  - comma-list: basic, whitespace-trimmed (case-B fix),
    tab-separated, single value, dropped-empty-entries
  - JSON array: strings, numbers (float64 per JSON rules), mixed
    types incl. objects / arrays / null, commas-inside-strings
    (which comma-list can't do), leading-whitespace tolerance
  - Fallback path: malformed JSON retains error, error prefix
  - Narrow-detection contract: {"k":"v"} / null / 42 stay
    comma-list single-values

Exported sentinel ErrEmptyOrNullArray satisfies err113 for an
ambiguous-shape JSON result the caller may want to log.

Type coercion (e.g., "42" -> int64(42) when field type is int) is
deliberately NOT in the sub-parser — that's bridge-tagger
territory (W2 §3 / P5.1 plan doc §6). The sub-parser returns
JSON-inferred types; the bridge-tagger coerces using go/types
info at the call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi and others added 30 commits June 1, 2026 22:52
After the grammar2 migration, the only remaining consumers of
internal/parsers are scanner (annotation classification) and the
path-annotation parsers used by builders/routes / operations to
read scanner-produced ParsedPathContent. Every other matcher
helper from the v1 regex era now had zero call sites.

Removed (with their backing regexes):
- HasAnnotation, IsAliasParam
- AllOfMember, AllOfName       (was rxAllOf)
- FileParam                    (was rxFileUpload)
- Ignored                      (was rxIgnoreOverride)
- AliasParam                   (was rxAlias)
- StrfmtName                   (was rxStrFmt)
- ParamLocation                (was rxIn)
- EnumName                     (was rxEnum)
- NameOverride                 (was rxName)
- DefaultName                  (was rxDefault)
- TypeName                     (was rxType)
- Rxf + every *Fmt format
  (rxMaximumFmt, rxMinimumFmt, rxMultipleOfFmt, rxMaxLengthFmt,
   rxMinLengthFmt, rxPatternFmt, rxCollectionFormatFmt, rxEnumFmt,
   rxDefaultFmt, rxExampleFmt, rxMaxItemsFmt, rxMinItemsFmt,
   rxUniqueFmt, rxItemsPrefixFmt) and rxRequired

The internal helpers commentMatcher and commentSubMatcher fall
with their last users; commentBlankSubMatcher and
commentMultipleSubMatcher survive (still used by ModelOverride /
ResponseOverride / ParametersOverride).

Tests updated:
- TestHasAnnotation, TestIsAliasParam, TestCommentMatcher,
  TestCommentSubMatcher gone.
- TestSchemaValueExtractors trimmed to the rxModelOverride +
  rxParametersOverride cases that still have production callers;
  cartesianJoin / titleCaseVariants / verifyBoolean /
  verifyIntegerMinMaxManyWords / verifyMinMax / verifyNumeric2Words
  / verifyRegexpArgs / makeMinMax dropped along with the dead
  regex-matrix coverage they supported.
- TestExtractAnnotation, TestCommentBlankSubMatcher,
  TestCommentMultipleSubMatcher kept verbatim.

Diff: 4 files, −492 net.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
After the grammar migration left helpers/ with one symbol per use
case, the package's "grab-bag of bridge utilities" charter no
longer applied. Each survivor moves to the place that owns it now.

Deleted outright (zero callers):
- Setter (lines.go)             — closure factory for SectionedParser
                                  callbacks; nothing left to set.
- CleanupScannerLines (lines.go) — was scaffolding for
                                  CollectScannerTitleDescription and
                                  the old extensions parser.
- CollectScannerTitleDescription + its three regexes (title_desc.go)
                                  — every consumer now reads
                                  block.Title() / block.Description()
                                  from grammar; the v1 heuristics
                                  (blank-line split, ATX heading
                                  promotion, punctuation-ends → title)
                                  live in grammar's lexer as the
                                  classifyProseRun heuristics.

Relocations:
- GetEnumBasicLitValue → internal/scanner/enum_value.go as
                         unexported enumBasicLitValue. Used by
                         exactly one site in scan_context.go; the
                         function coerces Go AST basic literals to
                         runtime values, which is a scanner-layer
                         concern, not a parsers-layer one.
- SchemesList / SecurityRequirements → internal/builders/common/
                         routemeta.go. Both are shared by
                         builders/routes and builders/spec (and only
                         those two); common already hosts shared
                         build-layer utilities.

Inlined at the call site:
- JoinDropLast + DropEmpty had one combined caller in spec/walker.go
  (the Terms-Of-Service body builder). DropEmpty + JoinDropLast is
  equivalent to "join non-blank lines with \n" — landed as a small
  local joinNonBlank in spec/walker.go.

internal/parsers/helpers/ directory removed. CLAUDE.md package
layout refreshed.

Test suite green (16 packages, no golden drift).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
An audit of regexp imports across internal/ surfaced three small
matchers in the build path whose simple byte-class work did not
justify a regex engine. All three are ported to hand-rolled
replacements in the same files:

- yaml/list.go:rxLineLeader (`^[\p{Zs}\t/\*]*\|?`) — strip leading
  whitespace/tab/slash/asterisk run plus optional `|` from a YAML
  list-body line before unmarshal. Replaced with stripListLeader:
  ASCII-only byte loop, same intent.

- spec/walker.go:rxStripTitleComments
  (`^[^\p{L}]*[Pp]ackage\p{Zs}+[^\p{Zs}]+\p{Zs}*`) — strip the
  Go-doc-comment `Package <ident>` prefix off a meta title.
  Replaced with stripPackagePrefix: detect leading non-letter run,
  match Package/package, eat whitespace, eat identifier, eat
  trailing whitespace. ASCII-only — every Go source title in the
  fixtures starts with an ASCII letter and uses ASCII spaces.

- spec/walker.go:httpFTPScheme (`(?:(?:ht|f)tp|ws)s?://`) —
  locate the URL prefix in a `Name <email> URL` contact line.
  Replaced with a six-entry urlSchemes lookup table and a
  find-leftmost-match scan. Same prefix set, no engine spin-up.
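
For the last of the three, a lookup table plus a leftmost-match scan could look like the sketch below. The six prefixes are derived from the regex quoted above; `splitAtScheme` is a hypothetical name and its trimming behaviour is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// urlSchemes lists the prefixes the regex (?:(?:ht|f)tp|ws)s?:// accepts.
var urlSchemes = []string{"http://", "https://", "ftp://", "ftps://", "ws://", "wss://"}

// splitAtScheme returns the text before the leftmost scheme and the URL from
// that point on. When no scheme is present it returns (trimmed line, "").
func splitAtScheme(line string) (head, url string) {
	at := -1
	for _, scheme := range urlSchemes {
		if i := strings.Index(line, scheme); i >= 0 && (at < 0 || i < at) {
			at = i
		}
	}
	if at < 0 {
		return strings.TrimSpace(line), ""
	}
	return strings.TrimSpace(line[:at]), line[at:]
}

func main() {
	head, url := splitAtScheme("Jane Doe <jane@example.com> https://example.com")
	fmt.Printf("%q %q\n", head, url)
}
```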

Remaining regexp imports under internal/ — all kept by design:
  parsers/{regexprs,matchers,parsed_path_content}.go — scanner
      classification fast path; restructuring is deferred per the
      "ast bits of interest" work item.
  scanner/index.go — user-facing package include/exclude patterns
      take regex by API contract.
  schema/walker.go — RE2 compile check on user-supplied `pattern:`
      values; correctness requires the actual engine.

Test suite green; no golden drift.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
A verbatim 5-line buildOption helper was copy-pasted into
parameters/parameters.go and responses/responses.go. Both copies pick
between schema.WithType (for body schemas) and schema.WithSimpleSchema
(for non-body parameter / header sites) based on typable.In().

Hoist into the schema package where the Build modes already live:

    func OptionFor(tpe types.Type, tgt ifaces.SwaggerTypable) Option {
        if tgt.In() == "body" {
            return WithType(tpe, tgt)
        }
        return WithSimpleSchema(tpe, tgt, tgt.In())
    }

Both consumers already import schema, so swapping the local helper
for schema.OptionFor at every call site is mechanical. The local
buildOption definitions drop.

If a third Build mode ever lands, or the body/non-body gate gains
nuance (allowEmptyValue exception, etc.), the dispatch becomes a
one-place edit.

No behaviour change. Full suite green; no golden drift.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…esList

Grammar already had an unexported `splitCommaList` powering
Block.GetList for ShapeCommaList keywords. Export it as
grammar.SplitCommaList and reuse it at the two routes/spec call
sites that read a raw Property.Value (the dispatchers can't use
Block.GetList without restructuring the keyword switch).

common.SchemesList, a 10-line wrapper around the same algorithm,
retires. common/routemeta.go shrinks to just SecurityRequirements,
which has no equivalent on grammar's typed surface: its
"name: scope1, scope2" line shape, with the v1 quirk that truncates
each scope at its first word, is specific to meta/route bodies.

A short doc-block on routemeta.go now explains why the file
exists at all — single utility, two callers, retire-or-hoist
on the next data point.

No behaviour change. Full suite green; no golden drift.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
New fixture under fixtures/enhancements/parameters-map-postdecl/
exercises a long-standing bug in parameters.buildFromFieldMap:
the schema sub-builder's PostDeclarations are silently dropped
instead of being propagated to the parameters builder's chain.

Repro shape:
- LocalItem — NOT swagger:model annotated; only reachable via the
  map field below.
- MapParams (swagger:parameters mapBody) — body field of type
  map[string]LocalItem.
- swagger:operation POST /items mapBody — declares the operation
  so the parameters declaration has somewhere to attach.

The buggy golden captured here shows the inconsistency clearly:

  "schema": {
    "type": "object",
    "additionalProperties": { "$ref": "#/definitions/LocalItem" }
  }

…but the spec has no "definitions" section at all. The $ref
points at a definition that does not exist. Downstream tooling
(generators, validators) would either fail loudly or silently
drop the property type.

The bug exists because every sibling builder method —
buildFromFieldStruct, buildFromFieldInterface, buildNamedField,
buildFieldAlias — propagates sb.PostDeclarations(). Only
buildFromFieldMap forgets the loop. Likely an oversight from when
the schema-builder factor-out landed (Stream M2.5).

The fix is one for-loop. It lands in the follow-up commit, which
regenerates this golden: the diff there will show LocalItem
appearing in definitions, demonstrating the fix against this
known-broken baseline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…uild

The one-loop fix to the bug witnessed in the previous commit:
parameters.buildFromFieldMap now propagates the schema
sub-builder's PostDeclarations to the parent builder's chain,
matching what every sibling buildFromFieldXxx method already
does.
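
The shape of the fix, with every type reduced to a hypothetical stand-in (`decl`, `schemaBuilder`, `paramsBuilder`), is roughly:

```go
package main

import "fmt"

// decl is a hypothetical stand-in for a declaration discovered while
// building a schema and still waiting to be emitted under definitions.
type decl struct{ Name string }

// schemaBuilder is a hypothetical stand-in for the schema sub-builder.
type schemaBuilder struct{ post []decl }

// PostDeclarations returns the declarations the sub-builder discovered.
func (sb *schemaBuilder) PostDeclarations() []decl { return sb.post }

// paramsBuilder is a hypothetical stand-in for the parameters builder.
type paramsBuilder struct{ postDecls []decl }

// buildFromFieldMap sketches only the propagation step: without this loop
// the $ref is emitted but its target never reaches the definitions section.
func (p *paramsBuilder) buildFromFieldMap(sb *schemaBuilder) {
	// ... the map schema itself would be built via sb here ...
	for _, d := range sb.PostDeclarations() {
		p.postDecls = append(p.postDecls, d)
	}
}

func main() {
	sb := &schemaBuilder{post: []decl{{Name: "LocalItem"}}}
	p := &paramsBuilder{}
	p.buildFromFieldMap(sb)
	fmt.Println(p.postDecls) // [{LocalItem}]
}
```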

The witnessing golden delta:

+ "definitions": {
+   "LocalItem": {
+     "description": "LocalItem — NOT annotated; reachable only via the map field on\nMapParams below.",
+     "type": "object",
+     "properties": {
+       "name": { "type": "string", "x-go-name": "Name" },
+       "tag":  { "type": "string", "x-go-name": "Tag"  }
+     },
+     "x-go-package": "github.com/go-openapi/codescan/fixtures/enhancements/parameters-map-postdecl"
+   }
+ }

The earlier golden referenced #/definitions/LocalItem but had no
definitions block. Post-fix, the referent exists.

No other golden drift — verified with full go test ./...

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Grammar gains a sibling sub-parser for the `Security:` block body —
internal/parsers/security/, modelled on the yaml sub-parser that
already handles extensions:. The lex-time dispatch in
grammar.emitRawBlock calls security.Parse on the `security:` raw
body and stores the typed []security.Requirement on baseBlock.
Block exposes it via SecurityRequirements().

Builders consume it the same way they consume Extensions: read off
the block after the property loop, drop the per-keyword dispatch
case. routes/walker.go and spec/walker.go each shed the
KwSecurity case in dispatchRouteKeyword / dispatchMetaSimple.

common/routemeta.go retires entirely — SecurityRequirements was
its last function (SchemesList moved to grammar.SplitCommaList in
M6.4-C3). The common package shrinks back to just *common.Builder.

The v1 quirk (per-scope whitespace truncation at the first word)
is codified in security.Parse with a comment explaining the
intent. No behaviour change vs the prior common.SecurityRequirements
implementation; fixtures only exercise single-word scopes today.
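
A sketch of the line shape and the truncation quirk, under stated assumptions: `requirement` and `parseSecurity` are hypothetical names, and skipping lines without a colon or with an empty name is a guess rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// requirement is a hypothetical stand-in for security.Requirement.
type requirement struct {
	Name   string
	Scopes []string
}

// parseSecurity reads "name: scope1, scope2" lines. Each scope keeps only
// its first whitespace-separated word, mirroring the v1 quirk.
func parseSecurity(lines []string) []requirement {
	var out []requirement
	for _, line := range lines {
		name, rest, ok := strings.Cut(line, ":")
		name = strings.TrimSpace(name)
		if !ok || name == "" {
			continue // assumption: lines without a usable name are skipped
		}
		req := requirement{Name: name}
		for _, raw := range strings.Split(rest, ",") {
			fields := strings.Fields(raw)
			if len(fields) == 0 {
				continue
			}
			req.Scopes = append(req.Scopes, fields[0]) // truncate at first word
		}
		out = append(out, req)
	}
	return out
}

func main() {
	fmt.Println(parseSecurity([]string{"oauth2: read, write extra", "api_key:"}))
}
```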

Full suite green; no golden drift.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…Block

Two single-line value parsers move from spec/walker.go into
grammar — option 1 in our M6.4 discussion (trivial enough that a
sibling sub-parser is overkill; the parse logic lives in
grammar/meta_info.go).

New types and accessors on grammar.Block:
- type Contact struct { Name, Email, URL string }
- type License struct { Name, URL string }
- Block.Contact() (Contact, error)  — error surfaces malformed
                                       `Name <email>` heads
- Block.License() (License, bool)   — bool is "found", no parse
                                       failure path

The error return on Contact preserves the existing v1 contract:
TestMalformed_BadContact still aborts the build on a malformed
contact line. License has no parse failure mode (Name and URL
just split on the URL scheme), so the simpler (License, bool)
shape applies.

Both accessors are lazy — they iterate properties on call.
Single-line input, trivial cost; no lexer changes needed.
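
To illustrate why Contact can fail while License cannot, here is a simplified sketch of parsing a `Name <email> URL` line with net/mail. It is not the code in grammar/meta_info.go: `contact` and `parseContact` are hypothetical, and the URL is detected as a trailing field containing "://" rather than through the scheme table.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// contact is a hypothetical stand-in for grammar.Contact.
type contact struct{ Name, Email, URL string }

// parseContact peels a trailing URL off the line, then hands the remaining
// "Name <email>" head to net/mail, which is where malformed input errors.
func parseContact(line string) (contact, error) {
	var c contact
	fields := strings.Fields(line)
	if n := len(fields); n > 0 && strings.Contains(fields[n-1], "://") {
		c.URL = fields[n-1]
		fields = fields[:n-1]
	}
	head := strings.Join(fields, " ")
	if head == "" {
		return c, nil
	}
	addr, err := mail.ParseAddress(head)
	if err != nil {
		return contact{}, fmt.Errorf("contact: %w", err)
	}
	c.Name, c.Email = addr.Name, addr.Address
	return c, nil
}

func main() {
	c, err := parseContact("Jane Doe <jane@example.com> https://example.com")
	fmt.Printf("%+v %v\n", c, err)
}
```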

Migrated out of spec/walker.go:
- parseContactInfo, parseLicense, splitURL, urlSchemes (~60 lines)
- net/mail import

spec/walker.go reads the typed accessors after the property loop,
same shape as block.SecurityRequirements() from the previous
commit.

Full suite green; no golden drift; TestMalformed_BadContact still
sees the build abort on malformed input, as expected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
… refactor

Captures the legacy swagger:route body sub-language behaviour (Parameters: and
Responses: blocks parsed by builders/routes/body_params.go and
builders/routes/body_responses.go) across 28 focused fixtures with golden JSON.

Existing route coverage was petstore-shaped (Consumes/Produces/Schemes/
Responses/Security/Extensions) with ZERO integration goldens exercising the
`+ name:` chunk grammar — when the legacy parsers retire in M6.5-C the in-
package unit-test coverage retires with them. This commit closes that gap.

These goldens are WITNESS goldens for the M6.5 refactor: they lock the current
buggy output so M6.5-B/C diffs surface each fix landing as the rewrite
naturally delivers checkShape diagnostics (M6.5-B) and clean routebody design
(M6.5-C). Quirks logged as Q14-Q22 in .claude/plans/observed-quirks.md.

Coverage:
- Parameters block: path/query/header/form/body in: values; string/integer/
  number/boolean/array types; min/max/minlength/maxlength/format/default/
  enum/required/allowempty validations; body refs with [] and [][] array
  nesting; chained multi-param; unknown-key silent drop; empty + chunk;
  schema overrides on body refs.
- Responses block: positional (untagged), tagged body:/response:/description:;
  mixed body+response refs; description-only; default; array nesting; empty
  value; definition-fallback (found); ref-not-found (dangling); multi-codes.
- Combined: full petstore-shape route; multi-method same-path.
- Quirk fixtures: space-body misinterpretation, ref-not-found dangling.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…lers package

Pure code motion. The level-0 and items-level dispatchers used by the
parameters and responses builders move into the handlers package
alongside the per-shape callback factories they orchestrate. Tests and
integration goldens unchanged: byte-identical output, verified against
the M6.5-PRE witness suite.

Moved into the new handlers/dispatch_simple.go:

- DispatchParamLevel0 (from parameters/walker.go:dispatchParamLevel0)
- DispatchHeaderLevel0 (from responses/walker.go:dispatchHeaderLevel0)
- DispatchItemsLevel — collapses parameters' walkItemsLevel and
  responses' walkHeaderItemsLevel; their bodies were byte-identical
  modulo the diagnostic source.
- paramValidations adapter (from parameters/typable.go)
- headerValidations adapter (from responses/typable.go)
- paramRequiredBool helper (private to handlers)

The parameters/responses walkers now hold only AST-side orchestration
(applyBlockToField, collectXxxItemsLevels) and delegate the
grammar-side dispatch via handlers.DispatchParamLevel0 /
handlers.DispatchHeaderLevel0 / handlers.DispatchItemsLevel. paramTypable
and responseTypable stay in their packages — they implement
SwaggerTypable for upstream wiring, distinct from the SimpleSchema
validation adapters that moved.

This unblocks M6.5-C: routebody-emitted grammar.Block instances will
dispatch through the same handlers seam parameters/responses use today,
eliminating the duplication of validation logic that
builders/routes/body_params.go currently carries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…ackage

Pure code motion. The full-Schema dispatch quartet (level-0 and items-
level dispatchers, plus the per-shape Number/Integer/Bool/String/Raw
handler factories) moves out of schema/walker.go into the handlers
package, alongside the SimpleSchema dispatchers already there from
M6.5-A. Tests and integration goldens unchanged: byte-identical output
verified against the M6.5-PRE witness suite.

Moved into the new handlers/dispatch_schema.go:

- DispatchSchemaLevel0 (was schema/walker.go:walkSchemaLevel)
- DispatchSchemaItemsLevel (was schema/walker.go:walkItemsLevel)
- SchemaValidations adapter + NewSchemaValidations constructor
  (was schema/typable.go:schemaValidations)
- ApplyPattern (was schema/walker.go:applyPattern; exported because
  schema's refOverrideCollector still uses it)
- SetRequired, SetDiscriminator (was schema/walker.go:setRequired/
  setDiscriminator; SetRequired exported for refOverrideCollector)
- SchemaTypeOf (was schema/walker.go:schemaTypeOf; exported for
  refOverrideCollector)
- clearStaleEnumDesc (was schema/extensions.go; unexported, called
  from SchemaValidations.SetEnum)
- schemaNumberHandler, schemaIntegerHandler, schemaBoolHandler,
  schemaStringHandler, schemaRawHandler (per-shape factory closures,
  unexported)
- checkShape (unexported gate)

SchemaOptions{SimpleSchemaMode bool} replaces the Builder.simpleSchema
flag at the dispatcher boundary. The schema package's Builder exposes
a one-line schemaOpts() helper that packages its mode flag into the
struct on each call. The SimpleSchema-mode gate inside schemaBoolHandler
behaves identically — it emits CodeUnsupportedInSimpleSchema for
full-Schema-only keywords and silently skips required:.

The schema package keeps refOverrideCollector (field-on-struct
$ref-override allOf rewrite — not relevant to routes or other
consumers), applyBlockToDecl / applyBlockToField (AST-side entry
points), applyToRefField (collector driver), and flattenItemsTargets
(AST array-layer walk). The extensions.go file went away entirely
when clearStaleEnumDesc moved.

This unblocks M6.5-C: routebody-emitted grammar.Block instances for
body params/responses dispatch through handlers.DispatchSchemaLevel0
with no enclosing (nil) and no name (""), inheriting checkShape
gating + ApplyPattern's RE2 hygiene check + ParseDefault coercion —
the same validation engine struct fields go through today, without
re-implementing any of it in builders/routes/body_params.go.

This commit also delivers Q15 + Q16 fix-readiness: the checkShape
gate is now uniformly available on the routes dispatch path that
M6.5-C will wire up. (The golden delta for Q15/Q16 lands when M6.5-C
ships, not here, because routes still uses its legacy body_params.go
parser at this point.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Replaces the legacy regex+state-machine parsers in
builders/routes/body_params.go and body_responses.go (479 LOC retired)
with a hand-rolled tokenizer in the new internal/parsers/routebody
package + per-decl dispatch through handlers.DispatchParamLevel0
(SimpleSchema) and handlers.DispatchSchemaLevel0 (body schema). The
synthetic grammar.Block constructor (NewSyntheticBlock) lets routebody
emit standard Blocks the dispatchers consume without bespoke wiring.

# Architecture

- internal/parsers/routebody/{doc,parameters,responses,diag}.go: pure
  tokenizer over the legacy `+ name:` chunk grammar (parameters) and
  the positional `<code>: <tokens>` line grammar (responses). Returns
  typed ParamDecl / ResponseDecl values; ParamDecl carries the
  validation properties as a synthetic Block.

- internal/parsers/grammar/synthetic.go: NewSyntheticBlock factory.

- internal/builders/routes/walker.go: rewritten dispatchParameters /
  dispatchResponses orchestrate routebody output. Body params type-
  gate via validations.IsLegalForType (Q15/Q16-equivalent on routes
  SimpleSchema); body schemas via DispatchSchemaLevel0's checkShape.

# Fixes (Q14-Q22 per .claude/plans/observed-quirks.md)

The PRE goldens shift to witness each fix landing as planned:

- Q14: body param description no longer duplicates onto param.Schema.
- Q15/Q16: min/max/format on body refs now uniformly gate via
  checkShape; SimpleSchema params type-gate via typeGateBlock.
- Q17: bare `+` chunks now drop (with diagnostic) rather than emit
  empty parameter objects.
- Q18: unknown parameter keywords emit CodeInvalidAnnotation rather
  than silent-drop.
- Q19: empty `204:` value handled cleanly.
- Q20: `body Foo` space-separated form detected as typo; drop with
  diagnostic rather than mis-parse as response="body".
- Q21: leading-space artifact on tail-of-line descriptions removed
  (routebody strips leading whitespace from the accumulator).
- Q22: dangling response refs (name in neither responses nor
  definitions) drop with diagnostic rather than emit invalid $ref.

# Accept `- ` as chunk-start alias

Routebody accepts `- ` alongside the canonical `+ ` chunk-start sigil,
preparing for future YAML-style authoring (v2). No further YAML
semantics are implied at this point.
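
A minimal sketch of the chunk-start check described above, assuming the sigil must be followed by a space and may be preceded by indentation. `chunkStart` is a hypothetical name used for illustration, not the routebody API, and the real tokenizer may treat edge cases (such as a bare `+`) differently.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkStart reports whether line opens a parameter chunk and returns the
// text after the sigil. Hypothetical helper: both "+ " (canonical) and "- "
// (alias) are accepted, and no other YAML semantics are applied.
func chunkStart(line string) (string, bool) {
	trimmed := strings.TrimLeft(line, " \t")
	for _, sigil := range []string{"+ ", "- "} {
		if rest, ok := strings.CutPrefix(trimmed, sigil); ok {
			return rest, true
		}
	}
	return "", false
}

func main() {
	for _, line := range []string{"+ name: id", "  - name: limit", "in: query"} {
		rest, ok := chunkStart(line)
		fmt.Printf("%q -> %q %v\n", line, rest, ok)
	}
}
```

`strings.CutPrefix` requires Go 1.20 or later.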

# Behaviour changes beyond the planned Qs

- `pattern:` on inline route parameters now lands on param.Pattern
  (the legacy parser silently dropped pattern: from inline params —
  it wasn't in the applyParamField switch).
- minLength/maxLength on array params now drop (with diagnostic) per
  OAS v2: those are string-typed constraints. Use minItems/maxItems
  for arrays. The legacy parser mis-applied them.
- Malformed-input handling: duplicate body/response tags and unknown
  tags now emit CodeInvalidAnnotation and drop the response, rather
  than failing the entire build with a hard error. The
  TestMalformed_DuplicateBodyTag / BadResponseTag tests updated to
  check successful Run + captured goldens.
- TestRoutesParser / TestRoutesParserBody now seed the responses map
  with the names the classification fixture references, since Q22
  requires resolvable refs. Some hardcoded validation-pointer
  assertions updated to nil for type-gated array params.
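
The type gate on length keywords can be pictured with a small lookup table. This is a sketch under stated assumptions: `isLegalForType` is a hypothetical stand-in for validations.IsLegalForType (whose real signature is not shown here), and the table is an illustrative subset.

```go
package main

import "fmt"

// legalTypes maps a validation keyword to the OAS v2 types it applies to.
// Illustrative subset only; the real table is assumed to be larger.
var legalTypes = map[string][]string{
	"minLength": {"string"},
	"maxLength": {"string"},
	"minItems":  {"array"},
	"maxItems":  {"array"},
	"minimum":   {"integer", "number"},
	"maximum":   {"integer", "number"},
}

// isLegalForType reports whether keyword may decorate a parameter of type typ.
// Keywords missing from the table are reported as not legal.
func isLegalForType(keyword, typ string) bool {
	for _, t := range legalTypes[keyword] {
		if t == typ {
			return true
		}
	}
	return false
}

func main() {
	for _, kw := range []string{"minLength", "minItems"} {
		if isLegalForType(kw, "array") {
			fmt.Println(kw, "applies to array params")
		} else {
			fmt.Println(kw, "is dropped (with diagnostic) on array params")
		}
	}
}
```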

# Net code change

- Routes: -479 LOC (body_params.go, body_params_test.go,
  body_responses.go, body_responses_test.go, setters.go all retired).
- Routebody: +~370 LOC (4 files; clean tokenizer + diagnostics).
- Routes walker: +~200 LOC of orchestration replacing the deleted
  dispatchRouteKeyword body branches.
- Grammar synthetic factory: +30 LOC.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…r strip

`builders/routes/walker.go:trimCommentPrefix` was a routes-side
workaround for a coupling between `parsers.parsePathAnnotation` and
the grammar lexer's `stripComment`: parsePathAnnotation reshapes a
multi-line `/* ... */` block-comment route doc into per-line synthetic
`*ast.Comment` entries. After the swagger:route header line is
stripped, the per-line entries have NO `//` or `/*` prefix, so
`stripComment`'s `default:` branch fires — it returns the line
verbatim without running `trimContentPrefix`. Leading tabs / spaces
then survive into Title / Description; routes papered over it.

Now: parsePathAnnotation prepends `// ` to each synthetic per-line
comment (via a tiny `ensureCommentMarker` helper). Grammar's lexer
takes the `//` branch and runs `trimContentPrefix` (strips ` \t*/|`),
shedding leading whitespace exactly as it does for `//`-comment
routes. `trimCommentPrefix` and its godoc retire — the comment was
inaccurate anyway (it blamed grammar's lexer for the leak when the
actual cause was parsers' synthetic ast.Comment construction).
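
A reduced model of the interaction, for illustration only. The two function names come from the message above, but their bodies here are assumptions: the real `stripComment` has more branches, and whether `ensureCommentMarker` skips already-marked lines is a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureCommentMarker prepends "// " unless the line already carries a
// comment marker. The already-marked check is an assumption of this sketch.
func ensureCommentMarker(line string) string {
	if strings.HasPrefix(line, "//") || strings.HasPrefix(line, "/*") {
		return line
	}
	return "// " + line
}

// stripComment models the lexer step: only the `//` branch trims the
// content prefix (space, tab, '*', '/', '|'); other input is returned verbatim.
func stripComment(line string) string {
	switch {
	case strings.HasPrefix(line, "//"):
		return strings.TrimLeft(line[2:], " \t*/|")
	default:
		return line
	}
}

func main() {
	raw := "\t  Lists all pets."
	fmt.Printf("%q\n", stripComment(raw))                      // whitespace survives
	fmt.Printf("%q\n", stripComment(ensureCommentMarker(raw))) // whitespace shed
	fmt.Printf("%q\n", stripComment(ensureCommentMarker("  - first item")))
}
```

Because `-` is not in the trimmed character set, a leading dash survives the second path, which matches the dash-list behaviour described below.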

Two intentional behaviour shifts captured as Q23 in
.claude/plans/observed-quirks.md:

  - Markdown dash lists in route descriptions now survive. The
    legacy trimCommentPrefix stripped leading `-` too; the new
    trimContentPrefix path preserves them. Fixture:
    routes-description-dash-list.

  - `---` in route descriptions now triggers the YAML-fence absorber
    consistent with every other annotation. Subsequent prose
    (and any keyword blocks after it) is captured as YAML body and
    drops from the visible spec. The legacy trimCommentPrefix stripped
    `---` to an empty line, which masked this. Routes don't use YAML
    blocks, so this is a sharp
    edge — pathological authors only. Fixture:
    routes-description-yaml-fence-absorb.

Existing route goldens unchanged (no fixture in the suite carried a
`-` or `---` at line start in its description); the two new fixtures
are pure witnesses for the new contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…AsList

Replaces three ad-hoc list helpers (grammar.SplitCommaList,
yaml.ListBody, routes.bodyLines) with a single flex-list accessor
on grammar.Property. Block.GetList delegates to it. Every consumer
of list-shaped keywords (schemes / consumes / produces on routes,
spec/meta) now reads through the same seam, with the union of
accepted surface forms:

  Schemes: http, https            # inline, comma-separated
  Schemes:                        # multi-line, indented bare lines
    http
    https
  Schemes:                        # multi-line, YAML `- ` markers
    - http
    - https
  Schemes: http                   # inline + indented continuation
    - https

# Three latent issues fixed along the way

  - Inline raw-block values were silently lost. `Consumes:
    application/json` on a single line produced an empty Body
    because collectRawBlock skipped head.Text entirely (only
    collectRawValue had a single-line capture path). Prepending
    head.Text to bodyText restores parity.
  - Multi-line `Schemes:` was silently lost. KwSchemes was
    registered as asCommaList(), which the lexer doesn't expand
    into a body. Now asRawBlock(); combined with the inline-value
    capture above, every form works.
  - Different code paths per keyword maintained subtle inconsistencies
    (routes accepted comma for Schemes but not multi-line; meta
    silently lost inline Consumes; etc.). Eliminated by funnelling
    everything through Property.AsList.

# Algorithm (Property.AsList)

For each input line — Value (if non-empty) + each line of Body:
trim whitespace; drop a leading `- ` YAML marker if present; re-trim;
comma-split; trim each token; drop empties. Aggregate.
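
The steps above can be sketched as a free function. This is a minimal illustration, assuming the property exposes its inline value as a string and its body as a slice of lines; `asList` is a hypothetical stand-in, not the actual Property.AsList method.

```go
package main

import (
	"fmt"
	"strings"
)

// asList flattens an inline value plus body lines into a token list.
// Hypothetical stand-in for Property.AsList.
func asList(value string, body []string) []string {
	lines := make([]string, 0, len(body)+1)
	if value != "" {
		lines = append(lines, value)
	}
	lines = append(lines, body...)

	var out []string
	for _, line := range lines {
		line = strings.TrimSpace(line)
		// Drop a leading YAML list marker if present, then re-trim.
		line = strings.TrimSpace(strings.TrimPrefix(line, "- "))
		for _, tok := range strings.Split(line, ",") {
			if tok = strings.TrimSpace(tok); tok != "" {
				out = append(out, tok)
			}
		}
	}
	return out
}

func main() {
	fmt.Println(asList("http, https", nil))                    // inline, comma-separated
	fmt.Println(asList("", []string{"  http", "  https"}))     // indented bare lines
	fmt.Println(asList("", []string{"  - http", "  - https"})) // YAML `- ` markers
	fmt.Println(asList("http", []string{"  - https"}))         // inline + continuation
}
```

All four calls produce the same two tokens, `http` and `https`, mirroring the four surface forms listed earlier.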

# Stops at simple token lists

Does NOT touch:
  - enum values (whose elements may be complex / JSON arrays);
  - the `+ name:` Parameters chunk grammar (routebody-owned);
  - YAML structural bodies (securityDefinitions, extensions,
    infoExtensions — those parse through yaml.TypedExtensions /
    json.Unmarshal because their structure isn't a simple list).

# Retired helpers

  - parsers/yaml/list.go (yaml.ListBody + stripListLeader) — file
    deleted.
  - grammar.SplitCommaList — deleted.
  - routes.bodyLines — deleted from routes/walker.go (still lives
    in spec/walker.go for KwTOS, which is a prose-join, not a
    list).

# Witness fixtures

  - routes-lists-flex-forms: four routes, each using a different
    surface form. All resolve to the same token-list shape.
  - meta-lists-flex-forms: swagger:meta with mixed forms for
    schemes/consumes/produces.

# Backward compat

Every existing route fixture's goldens unchanged — they all used
forms that already worked under one path or another. The two
witness fixtures capture the now-unified contract.

Logged as Q24 in .claude/plans/observed-quirks.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Pre-M6.5-E meta diverged from every other annotation on extension
handling:

  - Routes / operations: consume block.Extensions() (typed surface
    from grammar). Non-x-* keys silently dropped at parse time, no
    diagnostic.
  - Meta: dispatch KwExtensions / KwInfoExtensions through
    yamlparser.TypedExtensions(p.Body) + validateExtensionNames,
    hard-erroring out of codescan.Run on any non-x-* key.

That meant double YAML parsing (grammar + meta), an inconsistent
error contract across annotation families, and no diagnostic
signalling for typos in routes / operations.

Unified contract: grammar emits CodeInvalidAnnotation in
collectExtensionsFromBody when isExtensionName rejects a key, then
drops it. Extension.Source now carries the keyword name
("extensions" / "infoExtensions") so meta can route entries to
swspec.Extensions vs swspec.Info.Extensions; consumers that don't
care (routes, operations) just ignore it.
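
A sketch of the diagnose-and-drop contract, under stated assumptions: `Extension` and `Diagnostic` are simplified stand-ins for the grammar types, `collectExtensions` takes a plain key list instead of a parsed YAML body, and the case-insensitive `x-` match is a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// Extension is a simplified stand-in for the grammar type.
type Extension struct {
	Name   string
	Source string // keyword that carried the entry: "extensions" or "infoExtensions"
}

// Diagnostic is a simplified stand-in for the grammar diagnostic.
type Diagnostic struct {
	Code    string
	Message string
}

// isExtensionName accepts keys starting with "x-". The case-insensitive
// match is an assumption of this sketch.
func isExtensionName(key string) bool {
	return strings.HasPrefix(strings.ToLower(key), "x-")
}

// collectExtensions keeps well-formed keys and diagnoses-and-drops the rest.
func collectExtensions(source string, keys []string) ([]Extension, []Diagnostic) {
	var exts []Extension
	var diags []Diagnostic
	for _, key := range keys {
		if !isExtensionName(key) {
			diags = append(diags, Diagnostic{
				Code:    "CodeInvalidAnnotation",
				Message: fmt.Sprintf("%s: %q is not an x-* key, dropped", source, key),
			})
			continue
		}
		exts = append(exts, Extension{Name: key, Source: source})
	}
	return exts, diags
}

func main() {
	exts, diags := collectExtensions("infoExtensions", []string{"x-team", "owner"})
	fmt.Println(exts)
	fmt.Println(diags)
}
```

A meta consumer could then switch on `Source` to route each entry, while routes and operations would ignore the field.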

Retired in builders/spec/walker.go:
  - validateExtensionNames
  - ErrBadExtensionName
  - the two yamlparser.TypedExtensions calls in dispatchMetaYAMLBlock

dispatchMetaYAMLBlock now handles only KwSecurityDefinitions (the
one structural-YAML body left); extensions / infoExtensions flow
through a single block.Extensions() loop in applyMetaBlock.

stripPackagePrefix simplified in the same commit: replaced the
~35-LOC byte-loop with strings.CutPrefix + TrimSpace. Same
semantics: skip leading whitespace, match `Package `, drop the
identifier, return the tail. Rejects lowercase `package` so prose
like "package this carefully" doesn't get silently chopped (the
legacy accepted both; only the capital-P godoc convention is
honoured here).
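
One way the simplified helper could look, as a sketch only: the function name comes from the message above, but the body, the choice to return non-matching input unchanged, and the empty result for an identifier-only line are assumptions. `strings.CutPrefix` requires Go 1.20 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPackagePrefix drops a leading "Package <identifier>" from godoc text
// and returns the tail. Input without the capital-P prefix is returned as-is.
func stripPackagePrefix(text string) string {
	rest, ok := strings.CutPrefix(strings.TrimLeft(text, " \t\n"), "Package ")
	if !ok {
		return text
	}
	rest = strings.TrimLeft(rest, " \t")
	i := strings.IndexAny(rest, " \t\n")
	if i < 0 {
		return "" // only the identifier was present
	}
	return strings.TrimSpace(rest[i:])
}

func main() {
	fmt.Printf("%q\n", stripPackagePrefix("  Package codescan scans Go sources."))
	fmt.Printf("%q\n", stripPackagePrefix("package this carefully"))
}
```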

TestMalformed_MetaBadExtensionKey and
TestMalformed_InfoBadExtensionKey switched from require.Error to
golden-comparison witnesses — they now expect Run to succeed with
the bad key absent from the emitted spec, matching the
diagnose-and-drop posture M6.5-C established for routes.

Logged as Q25 in .claude/plans/observed-quirks.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Caught during the M6.5 observed-quirks audit pass. The M6.5-C entry
in observed-quirks.md claimed Q21 was resolved (routebody's
description accumulator would strip leading whitespace), but the
re-captured goldens still showed `" OK"`, `" not found"`, etc. — the
legacy leading-space artifact was never actually stripped, and the
M6.5-PRE re-capture had locked the still-buggy state under the false
assumption it would land via routebody.

Cause: routebody/responses.go's `description:`-tag branch appended
the (empty) post-colon val from a token like `description:` before
the tail tokens, so `strings.Join(["", "OK"], " ")` produced `" OK"`.
The legacy parseTags had the same bug; routebody inherited it.

Fix: skip the empty val before appending to descTokens. The tail
tokens join with single spaces between non-empty entries; no leading
space.
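
The bug and its fix reduce to a few lines. A minimal sketch, assuming the accumulator is a plain string slice; `joinDescription` and its `skipEmpty` switch are hypothetical, added only to show both behaviours side by side.

```go
package main

import (
	"fmt"
	"strings"
)

// joinDescription models the description accumulator: val is the text glued
// to the `description:` token, tail holds the tokens that follow it.
// skipEmpty=false reproduces the old behaviour, skipEmpty=true the fix.
func joinDescription(val string, tail []string, skipEmpty bool) string {
	var descTokens []string
	if val != "" || !skipEmpty {
		descTokens = append(descTokens, val)
	}
	descTokens = append(descTokens, tail...)
	return strings.Join(descTokens, " ")
}

func main() {
	fmt.Printf("%q\n", joinDescription("", []string{"OK"}, false)) // " OK"
	fmt.Printf("%q\n", joinDescription("", []string{"OK"}, true))  // "OK"
}
```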

Goldens shift for every response fixture using the
`description: X` form — primarily the route fixtures plus a handful
of pre-existing classification / petstore goldens that carry the same
shape on inline `description:` lines.

Audit-trail lesson: the witness-then-fix pattern relies on goldens
actually showing the post-fix shape. When the fix doesn't land but
the re-capture happens anyway, the harness silently locks a
half-state. The refresh audit is precisely where that drift surfaces.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Applies the M6.6 sweep recipe to builders/common: strip historical
narration, lift dense prose into a new package README, apply the M0
`# Details` pattern to the godoc that remains.

Specifically:

  - Dangling `Implementation notes (temporary)` block at the file
    tail (blockCache rationale + the
    .claude/plans/grammar/p7.1-... pointer) lifted to
    common/README.md §blockcache. Replaced inline with a
    `# Details` reference on `ParseBlocks`.

  - `MakeRef` godoc's "Same shape as ... used to have" historical
    justification rewritten as a present-tense statement of why
    the operation lives on the common base (§makeref).

  - `Diagnostics` parenthetical `(per plan §11 Q2 - "watch and
    improve when we start doing some serious work with
    diagnostics")` lifted to §diagnostics as the LSP-evolution
    caveat. The "Experimental" disclaimer folded into the same
    paragraph.

  - Two TODO(fred) markers (slog logger configurability; ireturn
    lint posture on ParseBlock) moved to §quirks-open as deferred
    follow-ups. The nolint directive itself retained — only the
    inline TODO archaeology moved.

  - Stray `// repo` comment removed.

  - Package godoc added — points readers at the README from the
    top of the file.

Code behaviour unchanged; only godoc + README.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
The godoc on GetEnumDesc pointed at `schema/extensions.go:clearStaleEnumDesc`
but the function actually lives in `handlers/dispatch_schema.go`. Update
the reference so the breadcrumb leads where the code does.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M0 documentation recipe to internal/scanner/ production
files: strip historical narration, drop deprecated grammar-evolution
references, and lift dense maintainer prose into a new
internal/scanner/README.md with anchored sections.

What moved where:

- options.go: stripped "Matches v1's strict default" historical
  narration on DescWithRef; the bare-$ref shape is now described as
  the current default with no time reference. Added a struct-level
  godoc on Options that points at §options / §descwithref /
  §diagnostics.
- scan_context.go: rephrased FileSet/OnDiagnostic/GetModel/
  FindModel/AddDiscoveredModel godocs to the M0 short-intent-plus-
  README-anchor shape. "previously-registered" on GetModel rewritten
  as a current-state description ("discovered decls already present
  in ExtraModels") and lifted into §model-lookup.
- index.go: detectNodes godoc condensed to one line plus a
  §classifier README anchor; the bitmask/exclusivity rules now live
  in the README.
- New README.md sections: §options, §descwithref, §diagnostics,
  §model-lookup, §classifier, §quirks-open — mirrors the M0 shape
  established by internal/builders/{common,schema}/README.md.

Code logic is unchanged — diff is godoc, comments, and the new
README only. Tests and golangci-lint --new-from-rev=HEAD pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment-summarisation sweep to internal/builders/handlers:

- Strip "v1 accepts any string", "v1 never propagated default/example
  parse errors", "responses' parity — v1 ..." narration from
  CollectionFormatString, Raw, and DispatchHeaderLevel0; rewrite as
  present-tense contract statements.
- Collapse the long package godoc into an intent-focused sentence plus
  a "# Details" pointer to the new README.
- Fix a stale code-pointer in keywords.go (parameters/walker.go →
  dispatch_simple.go) and replace the inline carve-out paragraph with
  a README anchor.
- Lift dense rationale (Raw errSink contract, simpleSchema allow-list
  invariants, stale x-go-enum-desc cleanup, SimpleSchema vs full-Schema
  dispatch split, vendor-extension routing, collectionFormat fallback)
  into a new internal/builders/handlers/README.md following the
  M0 lift shape.

Code logic unchanged — only godoc, package comments, and a new README.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment-summarisation sweep to internal/builders/operations:

- Strip the "v1 only ever consumed one fenced body per operation; we
  preserve that" narration from applyBlockToOperation; the
  single-fenced-body contract stands on its own as a present-tense
  statement.
- Collapse the dense applyBlockToOperation godoc to a one-sentence
  intent plus a "# Details" pointer to the new README §walker.
- Trim the Builder/setPathOperation godoc to intent-focused statements
  and route maintainers to README anchors for the slot-reuse and
  Build-orchestration prose.
- Lift the dense rationale into a new internal/builders/operations/README.md
  following the M0 lift shape, with sections for the builder, the
  path-item-slot reuse, the walker contract, and the open
  single-fenced-body quirk.

Code logic unchanged — only godoc, package comments, and a new README.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Strip historical-narration tics from schema-package godoc that
referenced "v1 behaviour", "parity with v1", "legacy" helpers, and an
internal plan section pointer. Comments now describe current-state
behaviour or lift the rationale (extension SkipExtensions scope,
items-chain ownership on named/alias arrays, $ref allOf compound
shape).

Logic-neutral — diff covers only doc-comment lines; tests pass and
golangci-lint reports zero issues.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Strip historical narration and internal plan pointers from the
validations package godoc and consolidate the rationale into a new
maintainer-notes README following the M0 §section pattern.

Code logic is unchanged — only comments, godoc, and the README.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment summarisation recipe to internal/parsers/yaml:
strip historical narration (round-1 / round-2, "legacy" qualifiers,
"grammar v1 does NOT import this package"), collapse the two
`.claude/plans/typed-extensions.md` references, and lift the
typed-extensions rationale (YAML → JSON normalisation, dedent
strategies, sibling-sub-parser seam) into a new
`internal/parsers/yaml/README.md` following the canonical M0
`# Details` template with TOC anchors.

Also fix a stale `Parse` godoc reference to a `pos` parameter that
the current signature does not carry.

Code logic unchanged. `go test ./...` green;
`golangci-lint run --new-from-rev=HEAD ./internal/parsers/yaml/...`
zero issues.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Sweep `internal/builders/routes/` godoc and the package README to
strip Q-number references, "legacy parser" / "matches v1 behaviour"
narration, and historical migration commentary. Rewrite each
affected comment as a present-tense statement of the current
contract; refactor the README to drop references to files that no
longer exist in the package and to land the M0 long-form pattern
(table of contents, table of files, per-section anchors).

No code logic changes — godoc and Markdown only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…rchaeology

Move the long-form Parameters: / Responses: sub-language grammar from
doc.go into a maintainer README, leaving doc.go with a current-state
contract pointing at the README. Drop migration narration (Q-numbers,
"legacy parser" callouts, body_params.go line refs) from every file in
the package; rewrite the affected sentences as present-tense behaviour
statements.

Code logic is unchanged; this is a comment-only sweep.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…lan refs

Adds internal/parsers/grammar/README.md as the long-form companion to
the grammar package source; strips historical narration, plan-file
references (`.claude/plans/grammar/*.md`), and v1/round-1/grammar2
framing from godoc throughout the package. Code logic is unchanged —
every diff sits inside `//` comments.

README sections created:
- §overview / §pipeline — what the package parses, three-stage flow
- §preprocess-contract — comment-marker stripping rules
- §lexer-contract — line classifier, body accumulator, prose classifier;
  swagger-prefix and godoc-route exception; directives; items-prefix;
  trailing-dot elision; first-character case-insensitivity
- §prose-classification — TITLE/DESC split heuristics (lifted from
  the dense classifyProse / classifyProseRunsInPlace / classifyTitleDescRun
  godoc)
- §raw-block-terminators — the sibling-terminator rule, inline-value
  capture, per-body indentation handling (lifted from collectRawBlock /
  isSiblingTerminatorFor)
- §yaml-fence-handling — opaque YAML bodies and decorative fences
- §disambiguation — default / enum-args / type-ref / HTTP-method
  value-shape dispatch
- §parser-contract / §block-shapes — family dispatch table, typed
  Block kinds and their fields (lifted from Block interface godoc,
  ParseAll godoc, parseState godoc)
- §property-shape — Property / TypedValue / IsTyped / AsList (lifted
  from Property godoc and AsList's surface-form table)
- §walker-contract — full Walker dispatch table, FilterDepth rules,
  concurrency contract (lifted from Walker godoc — a 60-line block
  collapsed to one sentence + Details pointer)
- §keyword-table / §context-legality — closed-vocab keyword
  classification and per-annotation context legality
- §annotation-args — per-annotation argument terminals and validation
- §typed-extensions / §security-requirements / §contact-license —
  body→typed accessors
- §diagnostics — Code / Severity model and the full code list
- §synthetic-block — NewSyntheticBlock factory
- §quirks-open — deferred follow-ups (body-shape, position fidelity,
  closed-vocab prefix)

Files touched: doc.go, ast.go, lexer.go, parser.go, walker.go,
keywords.go, diagnostic.go, annotations.go, disambiguate.go,
preprocess.go, token.go, synthetic.go.

All tests green; golangci-lint --new-from-rev=HEAD reports zero
new issues.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Replaces the stale auto-generated docs/annotation-keywords.md with
four hand-authored topic-split docs under ./docs/. Each carries
minimal Hugo frontmatter (title + weight); author-leaning content
shows first (lower weights), implementer content later.

  - annotations.md (weight 10) — author-first cheatsheet of every
    swagger:* annotation with placement rules, argument shapes,
    code samples, and a final annotation x keyword compatibility
    matrix.
  - keywords.md (weight 20) — keyword reference card grouped by
    family (numeric / length / format / schema decorators /
    parameter location / meta single-line / body). Per-keyword
    contracts with samples drawn from real fixtures.
  - sub-languages.md (weight 30) — the embedded languages: prose
    classification, flex-list reader, Parameters / Responses body
    grammars, YAML extensions, security requirements, contact /
    license inline forms.
  - grammar.md (weight 40) — formal EBNF lifted from the internal
    plan and actualised against current code. Lexer terminal
    vocabulary, top-level dispatch, per-family productions
    (schema / operation / meta / classifier), cross-cutting
    productions, walker contract, diagnostics model. Closing
    section on what the grammar deliberately does NOT describe
    (coercion, $ref resolution, etc. — analyzer territory).

Each doc cross-links to the others with relative paths. Code
samples are hybrid: inline snippet (so the doc reads end-to-end)
plus a one-line fixture path for the full executable example.

Per the M6.6 "one grammar only" rule, no traces of "grammar v1",
"grammar2", "v2 grammar", "the new grammar parser", "post-Stream-M",
or "round-1/round-2 typed-extensions" appear in the docs — the
published surface presents a single grammar that always existed in
its current form. The branch's multi-version refactor history stays
in .claude/plans/ (gitignored) and git log; the docs read
forward-only.

The old generator is not reintroduced. docs/annotation-keywords.md,
which carried a STALE banner pointing at the deleted generator code,
is removed.

The docs are v0 by intent. Some users will benefit from expanded
keyword samples and more worked end-to-end examples in a future
pass. The current shape is "ideal for maintainers, okay for users."

The eventual Hugo-published site will pick up the frontmatter
weights to order the navigation. A future doc generator (see the
testify codegen reference in the post-merge backlog) could augment
the hand-authored content; that's a separate project.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>