Feat/new parsing layer#20
Open
fredbi wants to merge 142 commits into
Place empty package files under internal/parsers/grammar/ as the
landing zone for P1 work: preprocess.go, lexer.go, parser.go, ast.go,
diagnostic.go, style.go. Each file carries a TODO pointer to the P1
task. No behavior; go build ./... remains clean.
See .claude/plans/grammar-parser-architecture.md and
.claude/plans/grammar-parser-tasks.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add keywords.go (types + functional-options constructor + Lookup/Keywords
accessors) and keywords_table.go (the authored []Keyword data: 35 entries
covering validations, property flags, meta single-line, block headers,
plus W5 externalDocs).
Design choices, per architecture §3.4 / §2.2.1 and tasks P0.2:
- Kind enum lists the 8 sub-contexts where keywords may appear
(Param, Header, Schema, Items, Route, Operation, Meta, Response).
- ValueType covers the primitive-typed values the parser will
convert in-line (Number/Integer/Boolean/StringEnum) plus the
deferred categories (String verbatim, CommaList, RawBlock, RawValue).
- Option A for docs: each keyword carries per-context doc strings
(inParam("…"), inSchema("…"), …) so LSP can show tooltips that
match where the cursor sits.
- W5 opportunistic: externalDocs entry landed alongside v1 keywords.
- W7 opportunistic: per-keyword legal-contexts list is exactly the
seed data LSP completion will consult; seeded from observed v1
behavior (regexprs.go + tagger trees).
Drop inResponse() and doc() options — no v1 keyword uses them; we
re-add if W6/W2 surface the need.
Add keywords_test.go covering Lookup (canonical, alias, case/space
normalization, unknown) and a shape invariant (every keyword has a
name, ≥1 context, and StringEnum implies Values).
Also add missing SPDX headers to the P0.1 placeholder files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
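A minimal sketch of the functional-options constructor and the Lookup accessor described in this commit. The names `keywordOption`, `withAliases`, `inContext`, `newKeyword` and `normalize`, the reduced `Keyword` shape, the two-entry table, and the normalization rule (lower-case, drop spaces and hyphens) are all assumptions made for illustration, not the committed code.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is a simplified stand-in for the sub-context enum (assumed shape).
type Kind int

const (
	KindParam Kind = iota + 1
	KindSchema
)

// Keyword is an assumed, reduced version of a table entry.
type Keyword struct {
	Name     string
	Aliases  []string
	Contexts []Kind
	Docs     map[Kind]string // per-context doc strings
}

// keywordOption is a hypothetical functional option.
type keywordOption func(*Keyword)

func withAliases(aliases ...string) keywordOption {
	return func(k *Keyword) { k.Aliases = append(k.Aliases, aliases...) }
}

// inContext registers a legal context together with its doc string.
func inContext(kind Kind, doc string) keywordOption {
	return func(k *Keyword) {
		k.Contexts = append(k.Contexts, kind)
		k.Docs[kind] = doc
	}
}

func newKeyword(name string, opts ...keywordOption) Keyword {
	k := Keyword{Name: name, Docs: map[Kind]string{}}
	for _, opt := range opts {
		opt(&k)
	}
	return k
}

// table holds two illustrative entries only.
var table = []Keyword{
	newKeyword("maximum", withAliases("max"),
		inContext(KindParam, "upper bound of the parameter"),
		inContext(KindSchema, "upper bound of the property")),
	newKeyword("maxLength", withAliases("max-length"),
		inContext(KindSchema, "maximum string length")),
}

// normalize folds case and drops spaces and hyphens (assumed rule).
func normalize(s string) string {
	return strings.NewReplacer(" ", "", "-", "").Replace(strings.ToLower(s))
}

// Lookup resolves a canonical name or alias to its table entry.
func Lookup(name string) (*Keyword, bool) {
	want := normalize(name)
	for i := range table {
		if normalize(table[i].Name) == want {
			return &table[i], true
		}
		for _, alias := range table[i].Aliases {
			if normalize(alias) == want {
				return &table[i], true
			}
		}
	}
	return nil, false
}

func main() {
	if kw, ok := Lookup("Max-Length"); ok {
		fmt.Println(kw.Name, "-", kw.Docs[KindSchema])
	}
}
```

Keeping the doc string next to the context in a single option means a keyword cannot declare a context without documenting it, which is the property an LSP tooltip would rely on.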
Add Severity (Error/Warning/Hint), Code (dotted stable identifier), and
Diagnostic{Pos, Severity, Code, Message}. Codes prefixed "parse." for
the grammar layer; sub-parser subpackages use their own prefix so codes
stay globally unique.
Pre-declare 10 codes the parser and its analyzers will emit in P1/P2
(invalid-number/integer/boolean/string-enum, unknown-keyword,
context-invalid, invalid-extension-name, unterminated-yaml,
invalid-annotation, malformed-line). The list grows as sites surface.
Expose Errorf/Warnf/Hintf constructors (formatted Message) and a
compiler-style Diagnostic.String() rendering so editor jump-to-line
tooling can consume the output directly.
Tests cover Severity.String, each constructor, the render format, and
the empty-position fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add internal/parsers/grammar/gen — a small command that reads the
authoritative Keywords() slice at `go generate` time and renders it to
docs/annotation-keywords.md (summary table + per-keyword details with
aliases, value type, legal contexts, per-context docs).
- //go:generate directive lives in keywords_table.go, runs
`go run ./gen -out ../../../docs/annotation-keywords.md`.
- Output is deterministic (same input -> byte-identical file), which
P0.5 will enforce in CI.
- Named constants for exit codes and perms avoid mnd lint flags; no
nolint directives needed.
Supporting additions:
- Kind.String() and ValueType.String() — labels used by the generator
(and later by P1.7 context-invalid diagnostic messages).
- `exhaustive` lint satisfied by explicit `KindUnknown`/`ValueNone`
cases with `fallthrough` to default.
Generated docs/annotation-keywords.md covers the 34 v1-parity keywords,
rendered at 326 lines.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Expose the generator's Render() function and add gen_test.go with
TestGeneratedDocIsCurrent: reads the committed
docs/annotation-keywords.md and compares it byte-for-byte against a
fresh render of the current keyword table. If the table changes and
the doc isn't regenerated, CI fails with an actionable message telling
the developer to run go generate and commit.
Rationale for putting the check in a _test.go rather than a dedicated
workflow: this repo delegates CI plumbing to go-openapi/ci-workflows'
shared workflows. Adding a bespoke workflow just for this check would
break that pattern for a one-line assertion. A unit test runs as part
of the existing go-test job at zero ceremony.
Completes P0. Next: P1 core parser pipeline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add Preprocess(cg, fset) -> []Line{Text, Pos}. Handles both // and
/* */ forms, including multi-line blocks with continuation asterisks.
Leading godoc decoration (whitespace, *, /, -, optional markdown
table pipe) is stripped via trimContentPrefix, mirroring the v1
rxUncommentHeaders regex so fixtures stay parity-compatible at the
parse-output level.
Position tracking: each Line carries the token.Position of the first
character of Text. Continuation lines inside a /* ... */ block report
Column=1 for simplicity; precise column reconstruction would require
re-tokenising and is deferred until LSP needs it.
Fence-body indentation is not preserved here — fence state lives at
the lexer layer (P1.2), so the preprocessor stays stateless and
position-only. Documented in the godoc.
Tests cover:
- nil CommentGroup / FileSet returns nil
- single-line and multi-line // comments
- /* ... */ blocks with leading '*' decorations
- markdown table-pipe stripping (and whitespace after the pipe)
- embedded whitespace preserved inside Text
- multiple *ast.Comment entries in one CommentGroup
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Tighten stripComment so Line.Pos.Line, .Column, and .Offset are all
accurate on every emitted line, including continuation lines inside
/* ... */ blocks. Previously the column was approximated to 1 on
block-comment continuation lines, which would have forced a
retrofit when LSP consumed Line.Pos.
Factor the per-line math into stripLine(s, pos) which advances pos
by the number of bytes trimContentPrefix consumed. Block-comment
paths compute each line's starting Offset by tracking the byte index
of the current line within the comment body and adding the "/*"
marker length.
Add column-precision tests:
- // line comment: `foo` sits at Column=4 ("//", space, f)
- /* block comment with " * prefix": content after " * " at col 4
- indented block with tab continuation: content at col 4
- offset monotonicity across a multi-line // group
Minor: extract `wantModelFoo` const in the test file to satisfy
goconst on the now-reused fixture string.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add TokenKind enum (EOF, BLANK, TEXT, ANNOTATION, KEYWORD_VALUE,
KEYWORD_BLOCK_HEAD, YAML_FENCE) and Token struct with per-kind fields
(Text, Value, Keyword, Args, ItemsDepth). Lex() emits one token per
preprocessed line plus a trailing TokenEOF.
Classification rules (per line):
- empty or whitespace-only -> BLANK
- "---" (trim-equal) -> YAML_FENCE
- starts with "swagger:" -> ANNOTATION (Text=name, Args=rest)
- "[items.]*<keyword>: <value>" -> KEYWORD_VALUE (Value populated)
- "[items.]*<keyword>:" -> KEYWORD_BLOCK_HEAD
- otherwise -> TEXT
Keyword recognition goes through grammar.Lookup, which already
handles case-insensitivity and aliases — canonical name is written
into Token.Text and the matching *Keyword is attached.
stripItemsPrefix mirrors rxItemsPrefixFmt: `(?:[Ii]tems[.\s]*)+`.
It does NOT overeat: "maxItems" stays a single keyword (prefix check
is anchored at position 0 of the token text, not sub-matched); and
bare "items:" stays a non-keyword TEXT since nothing in the table
matches "items" alone.
Position tracking: Token.Pos is advanced past any stripped items.
prefix so it points at the keyword's first character.
Godoc-identifier-prefix form ("DoFoo swagger:route ...") is NOT
handled in the lexer — deferred to P1.4 where the parser orchestrates
annotation discovery and can decide case-by-case.
Tests cover each token kind, each items-prefix depth variant, the
"maxItems must not overeat" edge, canonical-name resolution from
aliases (MAX -> maximum, max-length -> maxLength), malformed
"swagger:" falling back to TEXT, and Pos advancement after items.
prefix stripping.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
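As an illustration of the items-prefix rule, the sketch below anchors the quoted pattern at position 0 and derives the depth by counting `items` occurrences inside the matched prefix. Counting the depth this way and returning it alongside the remainder are assumptions; the commit only states the pattern and the no-overeat guarantee.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Anchored form of the pattern quoted above.
	rxItemsPrefix = regexp.MustCompile(`^(?:[Ii]tems[.\s]*)+`)
	rxItemsWord   = regexp.MustCompile(`[Ii]tems`)
)

// stripItemsPrefix returns the text after any leading "items." chain
// and the number of nesting levels that chain expressed.
func stripItemsPrefix(s string) (rest string, depth int) {
	prefix := rxItemsPrefix.FindString(s)
	if prefix == "" {
		return s, 0
	}
	return s[len(prefix):], len(rxItemsWord.FindAllString(prefix, -1))
}

func main() {
	for _, in := range []string{"items.items.maximum: 5", "maxItems: 3", "Items maxLength: 2"} {
		rest, depth := stripItemsPrefix(in)
		fmt.Printf("%-24q rest=%q depth=%d\n", in, rest, depth)
	}
}
```

The length of the matched prefix is also the amount by which Token.Pos would need to advance so that it points at the keyword's first character.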
Add Block interface and the seven typed kinds the parser will
dispatch to (architecture §4.6):
- ModelBlock (swagger:model)
- RouteBlock (swagger:route)
- OperationBlock (swagger:operation)
- ParametersBlock (swagger:parameters)
- ResponseBlock (swagger:response)
- MetaBlock (swagger:meta)
- UnboundBlock (no annotation — struct field docstrings)
Block interface: Pos, Title, Description, Diagnostics, AnnotationKind,
plus iter.Seq-based iterators Properties/YAMLBlocks/Extensions. Using
iter.Seq (Go 1.23+, module targets 1.25) per §4.2: iterator form, not
Accept/Visit callbacks.
Support types:
- Property {Keyword, Pos, Value, Typed, ItemsDepth}
- TypedValue {Type, Number, Integer, Boolean, String}
- RawYAML {Pos, Text} (captured --- body; not parsed here)
- Extension {Name, Pos, Value}
baseBlock (unexported, pointer-embedded) holds the shared state;
typed blocks embed it and add kind-specific positional fields. Exported
methods come through the embedding — external callers see the
interface surface and the kind-specific fields, nothing else.
AnnotationKind enum with String() and AnnotationKindFromName(name).
Labels factored into const block (labelRoute, labelModel, …) so the
same literal appears once — parser (P1.4) and analyzer will use the
same names for diagnostics.
Tests cover interface satisfaction (compile-time assertions), label
round-trip, full baseBlock accessor surface via a ModelBlock, and
iterator early-break semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Wire the preprocess → lex → parse pipeline: Parse(cg, fset) returns a
typed Block from a Go ast.CommentGroup. ParseTokens(tokens) is the
same without the preprocessor, for LSP scenarios where token streams
are synthesized.
Algorithm:
1. Scan tokens for the first ANNOTATION (or none).
2. Build typed Block via buildTypedBlock dispatch:
swagger:model -> ModelBlock{Name}
swagger:response -> ResponseBlock{Name}
swagger:parameters -> ParametersBlock{TargetTypes}
swagger:meta -> MetaBlock
swagger:route -> RouteBlock{Method,Path,Tags,OpID}
swagger:operation -> OperationBlock{Method,Path,Tags,OpID}
anything else -> UnboundBlock carrying the kind
3. parseTitleDesc on tokens before the annotation (first paragraph
is title, rest joined as description).
4. parseBody on tokens after the annotation: KEYWORD_VALUE and
KEYWORD_BLOCK_HEAD -> Property; YAML_FENCE pairs -> RawYAML body
captured via reconstructLine() best-effort.
5. Never panic: unknown tokens are skipped; unmatched YAML fence
emits CodeUnterminatedYAML diagnostic but still captures body.
Positional args for route/operation (P1.6 scope) are extracted here
already since the annotation token already carries them. Malformed
(<3 args) emits CodeInvalidAnnotation.
Bug fix in the preprocessor surfaced by the YAML fence tests: the
previous trimContentPrefix stripped leading `-`, which also ate the
`---` fence marker. `-` removed from the strip set. Bullet-list
dashes in description now survive to Text (arguably more correct
than v1's silent strip — flagged in the godoc).
Also renamed the internal `parser` type to `parseState` to avoid a
name clash with the go/parser package the tests import.
Tests (parser_test.go) cover:
- ModelBlock / RouteBlock / ParametersBlock / UnboundBlock dispatch
- route malformed -> CodeInvalidAnnotation diagnostic
- nil CommentGroup -> empty UnboundBlock
- title + description extraction (godoc-style ordering)
- properties in order + item-depth preservation
- block-head property (consumes:)
- balanced YAML fence -> body captured
- unterminated YAML fence -> diagnostic + body captured to EOF
- "don't panic on anything weird" sweep
Known gap (P2.1): YAML bodies are reconstructed from already-
classified tokens, so indentation and exact punctuation are lost.
Good enough for kind/content assertions; not yet suitable for YAML
re-parsing. P2.1 will add fence-state tracking so raw bytes survive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
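Step 3 of the algorithm can be sketched as follows. It operates on plain strings instead of tokens, and joining the lines inside one paragraph with a single newline is an assumption; the only behavior taken from the commits is that the first paragraph becomes the title and the remaining paragraphs are joined with a blank line.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTitleDesc splits prose lines into blank-separated paragraphs.
// The first paragraph is the title; the rest form the description.
func parseTitleDesc(lines []string) (title, desc string) {
	var paragraphs []string
	var current []string
	flush := func() {
		if len(current) > 0 {
			paragraphs = append(paragraphs, strings.Join(current, "\n"))
			current = nil
		}
	}
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			flush()
			continue
		}
		current = append(current, line)
	}
	flush()
	if len(paragraphs) == 0 {
		return "", ""
	}
	return paragraphs[0], strings.Join(paragraphs[1:], "\n\n")
}

func main() {
	title, desc := parseTitleDesc([]string{"Pet is a pet.", "", "It barks.", "", "Often."})
	fmt.Printf("title=%q desc=%q\n", title, desc)
}
```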
Populate Property.Typed for the four parse-time-convertible ValueTypes
(Number, Integer, Boolean, StringEnum); leave zero for the deferred
ones (String verbatim, CommaList, RawValue, RawBlock). Conversion
failures emit non-fatal diagnostics with the appropriate CodeInvalid*
code; Typed.Type stays ValueNone so consumers can distinguish "no
conversion performed" from "zero value successfully parsed".
Per-type rules:
- Number: strconv.ParseFloat; accepts v1's leading comparison
operator (<, <=, >, >=, =) which is captured in TypedValue.Op
for the analyzer to use for exclusiveMaximum / exclusiveMinimum.
- Integer: strconv.ParseInt base 10, rejects fractions.
- Boolean: strict "true"/"false" case-insensitive (stdlib
ParseBool is too lenient — it accepts "1"/"t"/"T"/"TRUE" etc.,
which v1 rejects).
- StringEnum: case-insensitive match against Keyword.Value.Values,
canonicalised to the table spelling.
Adds TypedValue.Op to ast.go for the operator prefix.
Tests (typeconv_test.go) cover valid conversion for each type, each
operator variant, case-insensitive boolean + enum, the stdlib-lenient
"1" rejection, fraction-rejection for integers, and non-primitive
value types staying at zero TypedValue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
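The per-type rules above might look like the sketch below. The function names and return shapes are hypothetical; the behavior shown (operator prefix captured separately, base-10 integers only, strict true/false) follows the rules listed in this commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumber accepts an optional leading comparison operator and
// returns it separately from the parsed float.
func parseNumber(s string) (op string, n float64, err error) {
	s = strings.TrimSpace(s)
	// Two-character operators are tried first so "<=" is not read as "<".
	for _, cand := range []string{"<=", ">=", "<", ">", "="} {
		if rest, ok := strings.CutPrefix(s, cand); ok {
			op, s = cand, strings.TrimSpace(rest)
			break
		}
	}
	n, err = strconv.ParseFloat(s, 64)
	return op, n, err
}

// parseInteger is base 10 only; fractions such as "1.5" are rejected.
func parseInteger(s string) (int64, error) {
	return strconv.ParseInt(strings.TrimSpace(s), 10, 64)
}

// parseStrictBool accepts only "true" and "false", case-insensitively,
// unlike strconv.ParseBool which also accepts "1", "t", "T" and so on.
func parseStrictBool(s string) (value, ok bool) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "true":
		return true, true
	case "false":
		return false, true
	}
	return false, false
}

func main() {
	op, n, err := parseNumber("<= 10.5")
	fmt.Println(op, n, err)
	_, ok := parseStrictBool("1")
	fmt.Println("\"1\" accepted as boolean:", ok)
}
```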
The v1 rxRoutePrefix allows one leading godoc identifier before
swagger:route, e.g.:
// ListPets swagger:route GET /pets tags listPets
matchGodocRoutePrefix scans a leading identifier (Unicode letter +
letters/digits/'_'/'-'), whitespace, then the literal "swagger:route"
terminated by whitespace or EOL. On match the lexer advances past the
prefix and feeds the rest to lexAnnotation, producing a normal
TokenAnnotation with Pos pointing at 's' of swagger:.
The exception is narrow by design:
- Only "swagger:route" — other annotations (model, operation,
parameters, …) keep the "must start the line" rule.
- "swagger:routex" does NOT match (guarded).
- Multi-word prefixes ("Do Foo swagger:route") do NOT match.
Positional Method/Path/Tags/OpID extraction was already in place
from P1.4 (fillOperationArgs). This commit just feeds the godoc-
prefixed form into the same path.
Tests cover: lexer-level classification (route-only exception,
'swagger:routex' rejection, multi-word-prefix rejection, Pos
advance past prefix), and end-to-end parser production of a proper
RouteBlock with zero diagnostics.
Also adds a nolint comment on Kind.String() for "route"/"operation"/
"meta"/"response" labels — goconst wants to share labelXxx from
ast.go but Kind (keyword context) and AnnotationKind (Block
dispatch) are intentionally separate concerns (architecture §4.6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
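A hand-rolled sketch of the matching rule described above. The signature (byte offset of the `s` in `swagger:` plus a match flag) and the treatment of whitespace as space or tab only are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// matchGodocRoutePrefix reports the byte offset of "swagger:route" when
// the line has the form "<identifier> <blanks> swagger:route ...".
func matchGodocRoutePrefix(s string) (int, bool) {
	const literal = "swagger:route"
	i := 0
	for i < len(s) {
		r, size := utf8.DecodeRuneInString(s[i:])
		if i == 0 {
			if !unicode.IsLetter(r) {
				return 0, false // identifier must start with a letter
			}
		} else if !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_' && r != '-' {
			break
		}
		i += size
	}
	if i == 0 {
		return 0, false
	}
	j := i
	for j < len(s) && (s[j] == ' ' || s[j] == '\t') {
		j++
	}
	if j == i || !strings.HasPrefix(s[j:], literal) {
		return 0, false // no separating blank, or a second word follows
	}
	end := j + len(literal)
	if end < len(s) && s[end] != ' ' && s[end] != '\t' {
		return 0, false // guards against "swagger:routex"
	}
	return j, true
}

func main() {
	off, ok := matchGodocRoutePrefix("ListPets swagger:route GET /pets tags listPets")
	fmt.Println(off, ok)
}
```

A line that starts directly with `swagger:route` does not match here, since the scan stops at the colon without finding a separating blank; that form stays on the ordinary annotation path.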
Add checkContextValidity(base) to the parser's post-body pass. For
each Property, check whether the Keyword's Contexts list intersects
the allowed set for the block's AnnotationKind; if not, emit
CodeContextInvalid as SeverityWarning (non-fatal).
Mapping (AnnotationKind -> allowed Kind union):
- AnnModel -> {Schema, Items}
- AnnParameters -> {Param, Schema, Items}
- AnnResponse -> {Response, Schema, Header, Items}
- AnnOperation -> {Operation, Param, Schema, Header, Items, Response}
- AnnRoute -> {Route, Param, Schema, Header, Items, Response}
- AnnMeta -> {Meta, Schema}
- everything else -> nil (skip check)
The sets are deliberately permissive: an operation body can host
schema properties, response headers, parameters, etc. Analyzers with
more context (Go type, enclosing struct) can enforce tighter rules.
UnboundBlock skips the check — its target context is determined by
the scanner from the enclosing declaration.
Diagnostic format:
keyword "in" not valid under swagger:model (legal in: param)
Tests (context_test.go) cover:
- legal keyword -> no diagnostic
- illegal keyword -> exactly one warning, message mentions the
keyword and its legal contexts
- multiple illegal keywords -> one diagnostic each
- keywords legal under multiple annotations (consumes: under Meta
/ Route / Operation) -> zero diagnostics
- UnboundBlock skips the check
- severity is Warning, never Error; Block is still produced
Cleanups shaken out by the new checks:
- contextsOverlap uses slices.Contains
- Kind.String() drops its now-unused nolint (goconst no longer
triggers)
- lexer_test.go references labelRoute instead of the literal "route"
- context_test.go helper hardcodes CodeContextInvalid (unparam)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
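The overlap check and the diagnostic message can be sketched as below. Only two annotation kinds and three keyword contexts are modeled, and `checkContext` returning a message plus a flag (instead of emitting a Diagnostic) is a simplification assumed for the example.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

type Kind int

const (
	KindParam Kind = iota + 1
	KindSchema
	KindItems
)

func (k Kind) String() string {
	switch k {
	case KindParam:
		return "param"
	case KindSchema:
		return "schema"
	case KindItems:
		return "items"
	default:
		return "unknown"
	}
}

type AnnotationKind int

const (
	AnnModel AnnotationKind = iota + 1
	AnnParameters
	AnnUnbound // not in the map: the check is skipped
)

// allowedKinds is a two-row excerpt of the mapping listed above.
var allowedKinds = map[AnnotationKind][]Kind{
	AnnModel:      {KindSchema, KindItems},
	AnnParameters: {KindParam, KindSchema, KindItems},
}

var annLabels = map[AnnotationKind]string{AnnModel: "model", AnnParameters: "parameters"}

func contextsOverlap(a, b []Kind) bool {
	for _, k := range a {
		if slices.Contains(b, k) {
			return true
		}
	}
	return false
}

// checkContext returns a warning message when none of the keyword's
// legal contexts is allowed under the annotation.
func checkContext(ann AnnotationKind, keyword string, legal []Kind) (string, bool) {
	allowed, ok := allowedKinds[ann]
	if !ok || contextsOverlap(legal, allowed) {
		return "", false
	}
	names := make([]string, len(legal))
	for i, k := range legal {
		names[i] = k.String()
	}
	return fmt.Sprintf("keyword %q not valid under swagger:%s (legal in: %s)",
		keyword, annLabels[ann], strings.Join(names, ", ")), true
}

func main() {
	msg, bad := checkContext(AnnModel, "in", []Kind{KindParam})
	fmt.Println(bad, msg)
}
```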
Add productions_test.go with one focused test per §2.1 envelope
production:
- annotation-only (annotation w/o body)
- title-only (single paragraph, no description)
- multi-paragraph desc (verifies \n\n join)
- properties-no-title (annotation followed by keywords)
- properties-interleaved (TEXT between properties is dropped)
- block-head property (consumes: value-less)
- empty YAML body (--- immediately followed by ---)
- multiple YAML blocks (two independent fenced sections)
- full-envelope order (title → desc → props → yaml composed)
Plus coverage-fill:
- Exhaustive String() on Kind, ValueType, TokenKind (previously
only spot-tested).
- Remaining AnnotationKind dispatch (ResponseBlock, MetaBlock,
UnboundBlock from strfmt / alias / allOf / enum / ignore / file).
Coverage on internal/parsers/grammar/ rises from 85.1% to 93.3% —
above the ≥90% exit criterion for P1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Two catch-up items from the P1 flag queue.
1. Verbatim YAML body contract.
- Add Line.Raw field — content with Go comment markers stripped
and one layer of godoc decoration removed, but all YAML
indentation preserved.
- Per comment kind:
`// foo` -> Raw = "foo" (one godoc space stripped)
`// 200: ok` -> Raw = " 200: ok" (2-space indent kept)
`/* * bar */` (cont.) -> Raw = "bar" (block continuation stripped)
`/* ... */` no-cont. -> Raw = full line (indentation kept)
- Preprocessor threads a per-comment-kind rawStrip strategy into
stripLine (stripSingleGodocSpace vs stripBlockContinuation).
- Lexer now tracks one bit of state (inFence) and emits a new
TokenRawLine kind carrying Line.Raw verbatim for every line
between matched `---` fences.
- Parser's collectYAMLBody simplifies to `post[i].Text`; the
best-effort reconstructLine() from P1.4 is deleted outright.
- `TestParseYAMLFenceBalanced` updated (via new production tests)
to assert indentation is preserved; body lines come through
ready for internal/parsers/yaml/ to consume.
2. Preprocessor `-` stripping — v2 divergence lock-in.
- Decision: keep leading `-` in Line.Text (don't restore v1's
silent strip). This preserves bullet-list semantics ("- foo"
stays "- foo", not "foo") and keeps the `---` YAML fence
marker detectable without special-casing.
- trimContentPrefix already dropped `-` from the strip set in
P1.4; this commit adds explicit tests that lock in the
behavior (TestP110DashNotStrippedInProse + preprocessor-level
TestP110DashPreservedOnlySurvivesVerbatimInText).
- The parity harness in P4 will verify no real fixture depended
on the old behavior; if one does, it's an explicit migration
decision then, not a silent deferral.
Coverage on internal/parsers/grammar/ lands at 94.5%. P1.10 queue is
empty — no pending flags roll over into P4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Extend Property with `Body []string`, populated for KEYWORD_BLOCK_HEAD
tokens (consumes:, produces:, security:, responses:, parameters:,
extensions:, infoExtensions:, tos:, securityDefinitions:,
externalDocs:). Non-block-head properties keep Body nil.
collectBlockBody(base, post, i): after emitting the block-head
property, consume subsequent TEXT tokens into prop.Body. Collection
stops at the next structured token (KEYWORD_*, ANNOTATION, YAML_FENCE,
RAW_LINE, EOF) per legacy stop point S6 from the implied-stop
appendix. Interior blank tokens are deferred and re-emitted as empty
body lines only if more text follows; trailing blanks are dropped.
Block.Properties() iteration order is unchanged; each block-head
Property now carries its body inline alongside the keyword metadata.
Analyzers (P5 bridge taggers) read prop.Body directly; per-keyword
tokenization (MIME types for consumes/produces, security mappings,
etc.) is their concern.
Tests (blockbody_test.go): single-block capture, stop-at-next-keyword,
stop-at-annotation, stop-at-YAML-fence, trailing-blanks-trimmed,
non-block-keyword-unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
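The deferred-blank rule can be sketched as follows. The token set is reduced, the function works on a bare token slice and returns the body with the index of the stop token, and blanks seen before the first body line are dropped; all three points are assumptions made for the example.

```go
package main

import "fmt"

type TokenKind int

const (
	TokenEOF TokenKind = iota
	TokenBlank
	TokenText
	TokenKeywordValue
)

type Token struct {
	Kind TokenKind
	Text string
}

// collectBlockBody gathers TEXT tokens starting at index i. Blank
// tokens are held back and only re-emitted as empty lines when more
// text follows, so trailing blanks never reach the body.
func collectBlockBody(tokens []Token, i int) (body []string, next int) {
	pendingBlanks := 0
	for ; i < len(tokens); i++ {
		switch tokens[i].Kind {
		case TokenBlank:
			if len(body) > 0 {
				pendingBlanks++
			}
		case TokenText:
			for ; pendingBlanks > 0; pendingBlanks-- {
				body = append(body, "")
			}
			body = append(body, tokens[i].Text)
		default:
			return body, i // any structured token ends the body
		}
	}
	return body, i
}

func main() {
	tokens := []Token{
		{Kind: TokenText, Text: "application/json"},
		{Kind: TokenBlank},
		{Kind: TokenText, Text: "application/xml"},
		{Kind: TokenBlank},
		{Kind: TokenKeywordValue, Text: "maximum"},
		{Kind: TokenEOF},
	}
	body, next := collectBlockBody(tokens, 0)
	fmt.Printf("%q stop at %d\n", body, next)
}
```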
When collectBlockBody encounters an `extensions:` or `infoExtensions:`
block head, each body TEXT token is also parsed into an Extension
{Name, Value, Pos} and appended to the Block's extensions slice.
Property.Body is still populated with the raw lines for analyzers
that prefer verbatim input.
parseExtensionLine splits "name: value" per-line; whitespace is
trimmed, blank-or-malformed lines are skipped. Name validation (the
`x-*` requirement) is deferred to P2.4.
Block.Extensions() returns the flat iterator; callers use it
uniformly regardless of whether the source used `extensions:` or
`infoExtensions:` (or, later, extension blocks that appear inline
inside other contexts).
Tests (extensions_test.go): basic extraction, infoExtensions path,
per-line source positions, parallel Body+Extensions survival,
scoping (consumes: body not scraped), malformed-line skipping.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add isExtensionName(s) — a well-formedness check mirroring the v1
rxAllowedExtensions pattern `^[Xx]-` (case-tolerant prefix, at least
one suffix character required). When collectBlockBody is in an
extensions block and extracts a name that fails the check, emit a
SeverityWarning CodeInvalidExtension diagnostic at the offending
line's Pos.
Non-fatal by design:
- The Extension is still appended to Block.Extensions() so
analyzers / LSP can decide policy (surface to user, drop, etc.).
- Pairs with the broader "diagnostics accumulate, don't throw"
principle (architecture §4.3, tasks P1.4).
Tests: `TestExtensionsInvalidNameDiagnostic` (invalid name emits
warning; extension still collected) and
`TestExtensionsAcceptsUppercaseX` (X- prefix accepted per v1
behavior).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
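A compact sketch of the two helpers from the extension commits. The return shape of `parseExtensionLine` and the choice to treat an empty name as malformed are assumptions; the name check follows the stated rule (case-tolerant `x-` prefix with at least one character after it).

```go
package main

import (
	"fmt"
	"strings"
)

// parseExtensionLine splits "name: value" on the first colon and trims
// both sides. Lines without a colon or without a name are skipped.
func parseExtensionLine(line string) (name, value string, ok bool) {
	before, after, found := strings.Cut(line, ":")
	name = strings.TrimSpace(before)
	if !found || name == "" {
		return "", "", false
	}
	return name, strings.TrimSpace(after), true
}

// isExtensionName checks for an "x-" or "X-" prefix followed by at
// least one more character.
func isExtensionName(s string) bool {
	return len(s) > 2 && (s[0] == 'x' || s[0] == 'X') && s[1] == '-'
}

func main() {
	for _, line := range []string{"x-rate-limit: 100", "rate-limit: 100", "not an extension"} {
		name, value, ok := parseExtensionLine(line)
		if !ok {
			fmt.Printf("%q: skipped\n", line)
			continue
		}
		fmt.Printf("%q: name=%q value=%q wellFormed=%v\n", line, name, value, isExtensionName(name))
	}
}
```

Keeping the split and the name check separate matches the non-fatal design: a badly named extension is still returned by the split, and the caller decides whether to warn.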
Create internal/parsers/yaml/ — a thin wrapper around
go.yaml.in/yaml/v3 for parsing the RawYAML bodies isolated by
internal/parsers/grammar/. The grammar parser stays YAML-free
(verified: `go list -f '{{.Imports}}' ./internal/parsers/grammar/`
has no yaml entry) per architecture §3.3, §5.1.
Exposed surface:
- yaml.Parse(body) -> (any, error):
Unmarshal into a generic value (map/slice/scalar). Returns
(nil, nil) for an empty body so callers don't branch on
error-vs-nil for the "fence with no content" case.
- yaml.ParseInto(body, dst) -> error:
Unmarshal into a caller-defined struct. Empty body is a
no-op, leaving dst at its zero value.
Both wrap the underlying error with a "yaml:" prefix so downstream
diagnostics can distinguish YAML parsing failures from other errors
without type-asserting.
Tests: empty body, flat map, nested structure with non-string keys
(map[any]any as YAML v3 returns), invalid YAML error wrapping,
struct unmarshal, empty+struct no-op.
Pattern: this subpackage establishes the seam for any future
sub-language (enum variants per W2, richer example syntax per W3,
private-comment bodies per W4). Each gets its own
internal/parsers/<name>/ package; the grammar parser never
imports any of them.
Completes P2. P1 and P2 are both fully green; ready for P3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Expose the parser as a three-method interface — Parse, ParseText,
ParseAs — behind NewParser(fset). The interface is the injection
seam the P5 bridge-taggers and property-based builder tests need
(architecture §5.3): tests construct Block values directly and
feed them through a mock Parser without re-lexing text.
- Parse(cg) primary path — preprocess -> lex -> parse
- ParseText(t,p) LSP / test path — raw text with position
- ParseAs(k,t,p) LSP kind-hint path (§4.6) — prepends a
synthetic swagger:<kind> so dispatch goes the
right way even when the editor hasn't typed
the annotation line yet.
preprocessText is a small helper that turns raw lines into Line
values carrying both Text and Raw (identical — no Go comment
markers to strip) with monotonic positions from basePos.
Backward compat: the existing package-level Parse(cg, fset) is now
a convenience wrapper around NewParser(fset).Parse(cg). All
existing tests pass unmodified.
Tests: interface satisfaction (compile-time), each method exercised
end-to-end, ParseAs forces dispatch on a property-only body, the
package-level Parse wrapper still works.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add the Option functional-options layer on NewParser:
type Option func(*parserImpl)
func NewParser(fset, opts ...Option) Parser
Ship WithDiagnosticSink(cb) as the first concrete Option — invokes
the callback for every Diagnostic the parser emits, in parallel with
the block-local accumulation that already exists. The LSP seam
(architecture §4.3): diagnostics must be surfaced as they're
produced, not batched until parse completes.
Under the hood, parseState gains a `sink func(Diagnostic)` field and
an `emit(d)` helper that fans out to sink + local slice. All nine
`p.diag = append(p.diag, …)` sites converted to `p.emit(…)`. The
package-level `ParseTokens` path keeps a nil sink (no callback) —
only parserImpl constructs a parseState with the Option's sink.
Logger option intentionally not added yet — parsers with no trace
output haven't needed one; when an LSP or CLI developer wants
verbose output we'll add it then. Option type is variadic so the
addition won't break callers.
funcorder lint honored: ParseAs moved above runParser so exported
methods cluster before unexported ones in parserImpl.
Tests: sink receives every diagnostic (matches Block.Diagnostics()
count), default behavior (no option) unchanged, codes in stream
match expected (CodeContextInvalid + CodeInvalidNumber from a
deliberately wonky source).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add six lookup methods to Block (and baseBlock):
Has(name) bool
GetFloat(name) (float64, bool)
GetInt(name) (int64, bool)
GetBool(name) (bool, bool)
GetString(name) (string, bool)
GetList(name) ([]string, bool)
All share a findProperty helper that matches canonical-name OR alias,
case-insensitively. `Max`, `max`, `MAXIMUM` all resolve to the same
`maximum` Property.
Typed getters return (zero, false) when the keyword is absent or its
ValueType doesn't match — so callers write:
if n, ok := block.GetFloat("maximum"); ok {
schema.Maximum = n
}
GetString is the permissive fallback: StringEnum returns the
canonical (table-spelled) value; everything else returns the raw
Property.Value. GetList unifies the two shapes of "a list of things":
a block-head Property's Body (consumes:, security:, …) is returned
directly; a ValueCommaList value (enum, schemes, …) is split on
commas with whitespace trimmed. GetList returns a defensive copy so
mutating the returned slice can't corrupt Block state.
Block interface grows from 8 to 14 methods. interfacebloat lint
silenced with a rationale: Block is the single consumer contract
for both builders and LSP; splitting into BlockInfo / BlockIterators
/ BlockAccessors would introduce friction at every call site and
gain nothing — there's no implementation other than baseBlock-
embedded typed kinds.
Coverage on internal/parsers/grammar/ holds at 93.8%. Full repo
test suite green.
Tests (accessors_test.go): Has + absent, alias + case-insensitive
lookup, GetFloat/Int/Bool/String happy paths, StringEnum
canonicalization, GetList for CommaList and Body shapes, defensive-
copy behavior, type-mismatch returning false.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
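The two list shapes handled by GetList can be sketched on a single property. The `IsCommaList` flag stands in for the ValueCommaList check and `getList` is a free function here; both are simplifications assumed for the example.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// Property is reduced to the fields the list accessor needs.
type Property struct {
	Value       string
	Body        []string // set for block-head keywords
	IsCommaList bool     // stand-in for a ValueCommaList value type
}

// getList returns the block body as a copy, or the comma-separated
// value split and trimmed. Anything else reports false.
func getList(p Property) ([]string, bool) {
	if p.Body != nil {
		return slices.Clone(p.Body), true // defensive copy
	}
	if !p.IsCommaList {
		return nil, false
	}
	parts := strings.Split(p.Value, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts, true
}

func main() {
	schemes, _ := getList(Property{Value: "http, https", IsCommaList: true})
	fmt.Printf("%q\n", schemes)

	consumes := Property{Body: []string{"application/json"}}
	body, _ := getList(consumes)
	body[0] = "mutated"
	fmt.Println(consumes.Body[0]) // the original body is untouched
}
```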
Per Fred's review: the Parser interface exists to enable mock
implementations in tests, not to support multiple production
variants. NewParser should return the concrete type so IDE
discoverability and docs-on-hover work; the interface stays as the
mock contract.
Changes:
- Rename parserImpl -> DefaultParser (exported).
- NewParser(fset, opts...) now returns *DefaultParser instead of
Parser. Callers who previously wrote `var p Parser =
NewParser(...)` continue to work via implicit satisfaction.
- Drop the ireturn nolint on NewParser (no longer applies).
- Parser interface godoc reframed: "consumer contract / mock seam".
- Compile-time assertion in parser_api_test.go updated to
`(*DefaultParser)(nil)`.
Per-method ireturn nolints on Parse/ParseText/ParseAs stay — those
still return Block (the AST's polymorphic family).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…mentView
Create internal/parsers/grammar/grammar_test/ (package grammartest)
with the parity-harness plumbing the P5 bridge-tagger migration
depends on:
- NormalizedCommentView — parser-agnostic, diff-friendly shape
capturing AnnotationKind + kind-specific positional args (Name,
Method/Path/Tags/OpID, TargetTypes), Title/Description, typed
Properties (value + Typed subtree), YAML bodies, Extensions, and
Diagnostics sorted by (code, severity) for determinism.
- ViewFromBlock(b grammar.Block) NormalizedCommentView — the v2
adapter. Uses a type-switch over the block family that is
explicit per kind (future AnnotationKind additions fail closed —
they just don't populate args).
- ParseSourceToViews(t, src) — test helper: parse a Go snippet,
walk its declarations, normalize each attached comment group.
- AssertGoldenView(t, path, views) — JSON-snapshot diff driver
honoring UPDATE_GOLDEN=1 (matches the existing scantest convention).
Seven committed golden fixtures cover the common comment-group shapes:
simple_model, route_with_tags, operation_with_yaml,
parameters_with_validations, meta_with_extensions, unbound_with_bullet,
and context_invalid_diag. These lock in the v2 parser's output; P5
commits will extend the set per builder.
Bug caught by the harness and fixed inline: UnboundBlock previously
dropped all prose because the parser routed the entire token stream
into parseBody. Added findBodyStart(tokens) — for an annotation-less
comment group, tokens are split at the first keyword/YAML-fence so
the prose prelude (e.g. a struct-field docstring) is recovered into
Title/Description. This is the canonical use case for UnboundBlock
and would have been a recurring bug in P5 without the harness catch.
P4.3 (Options.UseGrammarParser feature flag + scanner wiring) is
deferred to the first P5 bridge-tagger commit, where it has an
actual consumer — pre-P5 the flag has no path to exit through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add fixtures/enhancements/enum-overrides/ and TestCoverage_EnumOverrides
to pin the v1 behavior for five enum cases that W2 needs to answer
before P5.1:
A. swagger:enum + matching consts -> const inference
B. inline comma-list only, no consts -> inline
C. inline JSON-array only, no consts -> inline
D. swagger:enum with NO matching consts -> empty schema
E. swagger:enum + matching consts + inline -> inline WINS
The golden snapshot confirms Fred's proposed override semantics hold
in v1 today: case E renders enum=["urgent","normal"] from the inline
annotation even though PriorityE has three const values ("low",
"medium", "high"). So the v2 parser migration inherits this rule
rather than diverging.
Things the golden also surfaces (non-parity items, captured here so
they aren't re-discovered during P5):
- comma-list splitter preserves leading whitespace: B renders
["low"," medium"," high"] with literal spaces. A v1 quirk; P5
bridge-tagger should strip per-value whitespace (or the new
internal/parsers/enum/ sub-parser should).
- case D emits a property with NO type and NO enum. The
swagger:enum annotation is silently ignored when no consts
match. P5 should surface a diagnostic ("swagger:enum
TypeName resolved to zero values").
- case E retains x-go-enum-desc describing the const values even
though the inline override wins. Stale vendor-extension; P5
should drop x-go-enum-desc when the inline override takes
precedence.
This golden is the factual v1 reference. Any v2 divergence during
P5.1 will be explicit (new golden, documented rationale).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Windows CI fails on the new grammar_test AssertGoldenView harness
because git's default core.autocrlf=true converts line endings on
checkout. The existing internal/scantest harness sidesteps this by
comparing JSON semantically (assert.JSONEqT), but AssertGoldenView
does a byte-equal compare — exact by design so it catches
field-order regressions and trailing-newline changes. Byte-equal +
platform-dependent checkout = broken on Windows.
Two narrow fixes:
- Normalise CRLF -> LF on the read side only (bytes.ReplaceAll).
`got` is always freshly produced by json.MarshalIndent + a
trailing '\n', so it's LF-only; normalising `want` after
ReadFile makes the compare platform-independent without
loosening the "exact bytes expected" invariant.
- Add .gitattributes pinning *.json and both golden directories to
`eol=lf`. Belt-and-suspenders: prevents autocrlf from
corrupting goldens for any other tool that does byte-level
inspection (e.g., external diffs, editor tooling).
Both fixes together mean a fresh Windows checkout produces LF-only
golden files AND the harness tolerates CRLF-infected older
checkouts until they're refreshed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add the feature flag that lets the legacy regex-based tagger pipeline
and the new grammar-parser + bridge-tagger pipeline coexist during
the P5 migration. Default false: no behavior change yet;
bridge-tagger consumers land in subsequent P5.x commits.
The flag is plumbing only at this step. Its roles unlock across
the upcoming P5 work:
- Routing seam in the scanner/builders (step 4): when true, the
comment-group dispatch goes through grammar.Parser; when false,
the legacy taggers run.
- Dual-path parity harness (step 3b): runs codescan.Run twice per
fixture, once per flag value, diffs the results via the
NormalizedCommentView v1/v2 adapters. This is how every P5.x
migration verifies parity.
- P6 cutover: flag removed, grammar parser becomes the only path.
codescan.Options is a type alias for scanner.Options, so
codescan.Run callers see the field immediately — no public-API
shim needed.
Planning: .claude/plans/p5-builder-migrations.md §5.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Add internal/integration/parity_test.go: runs every fixture twice,
once per UseGrammarParser value, and asserts the resulting
*spec.Swagger values are JSON-equal. Twenty-one fixtures covering
enhancements/* and goparsing/petstore + goparsing/bookings.
Design rationale — spec-level compare, not view-level:
- Measures the user-observable contract (the spec). v1 and v2
producing identical specs through different internal paths is
by definition not a user-observable difference.
- Reuses the existing fixture corpus (21 TestCoverage_* fixtures
become 21 parity cases) with zero reconstruction.
- No lossy reverse-engineering from post-build spec to
per-comment-group views — the alternative v1-adapter approach
would have had to map Schema.Properties["foo"] back to its
source *ast.CommentGroup, which is intrinsically fragile
(multi-comment sites, synthesised fields).
- Failure messages surface the exact diverging JSON path, which
is what you need to debug.
See .claude/plans/p5-builder-migrations.md §5 for the full
discussion (rejecting the view-level adapter).
With the flag currently a no-op, the suite passes trivially.
Validates the harness plumbing (parallel t.Run, Options cloning,
WorkDir injection, error handling) before bridge-taggers land in
step 6. When the flag starts flipping pipelines, TestParity
becomes the per-commit safety net.
Excluded fixtures:
- UnknownAnnotation — intentionally an error-expected path, no
spec to compare.
- malformed/* — same reason.
These error paths get their own non-parity coverage in the
existing TestCoverage_* suite.
Tactical test — **P6 cutover deletes this file in the same commit
that removes Options.UseGrammarParser** (grammar-parser-tasks.md
P6.4 records the obligation; p5-builder-migrations.md §5.3 the
rationale). With no flag, the test has no dual paths to diff and
would become pure CI burden.
Also cross-referenced from forthcoming-features.md §5.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Per W2 §3 decision, introduce the enum sub-parser that the schema /
parameters / responses bridge-taggers will call at step 8+ of the
P5 migration. Sibling of internal/parsers/yaml/ — imported only by
the analyzer layer; the main grammar parser stays oblivious to
enum value shape (verified: grammar package does not import enum).
API — deliberately minimal:
func Parse(raw string) (values []any, fallbackErr error)
Shape detection:
- Leading `[` (after TrimSpace) -> JSON-array path via
encoding/json. Non-scalar values (objects, arrays, null) pass
through natively, supporting OAI's any-JSON-type enum semantics
(W2 §2.4 / forthcoming §1.3 audit).
- Otherwise -> comma-list, each value TrimSpace'd. This **fixes
v1's case-B whitespace quirk** (W2 §2.6): where v1 produced
`["red", " green", " blue"]`, v2 produces `["red","green","blue"]`.
Narrow JSON detection by design: only a leading `[` triggers the
JSON path. Inputs like `{"k":"v"}`, bare `null`, `42` go through
the comma-list path (matching v1 parity: users who want structured
values wrap them in `[...]`). Documented and locked in via
TestParseDetectionIsNarrow.
Fallback behavior: when input LOOKS like JSON (leading `[`) but
fails to parse, Parse falls back to comma-list AND returns a
non-nil fallbackErr wrapped with an "enum:" prefix. Bridge-taggers
surface this as a SeverityWarning diagnostic rather than aborting,
matching v1's forgiving semantics.
Empty/whitespace-only input -> (nil, nil). A fenced but empty enum
is a no-op.
Tests (enum_test.go) exercise:
- empty/whitespace handling
- comma-list: basic, whitespace-trimmed (case-B fix),
tab-separated, single value, dropped-empty-entries
- JSON array: strings, numbers (float64 per JSON rules), mixed
types incl. objects / arrays / null, commas-inside-strings
(which comma-list can't do), leading-whitespace tolerance
- Fallback path: malformed JSON retains error, error prefix
- Narrow-detection contract: {"k":"v"} / null / 42 stay
comma-list single-values
Exported sentinel ErrEmptyOrNullArray satisfies err113 for an
ambiguous-shape JSON result the caller may want to log.
Type coercion (e.g., "42" -> int64(42) when field type is int) is
deliberately NOT in the sub-parser — that's bridge-tagger
territory (W2 §3 / P5.1 plan doc §6). The sub-parser returns
JSON-inferred types; the bridge-tagger coerces using go/types
info at the call site.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
After the grammar2 migration, the only remaining consumers of
internal/parsers are scanner (annotation classification) and the
path-annotation parsers used by builders/routes / operations to read
scanner-produced ParsedPathContent. Every other matcher helper from
the v1 regex era now has zero call sites.
Removed (with their backing regexes):
- HasAnnotation, IsAliasParam
- AllOfMember, AllOfName (was rxAllOf)
- FileParam (was rxFileUpload)
- Ignored (was rxIgnoreOverride)
- AliasParam (was rxAlias)
- StrfmtName (was rxStrFmt)
- ParamLocation (was rxIn)
- EnumName (was rxEnum)
- NameOverride (was rxName)
- DefaultName (was rxDefault)
- TypeName (was rxType)
- Rxf + every *Fmt format (rxMaximumFmt, rxMinimumFmt,
  rxMultipleOfFmt, rxMaxLengthFmt, rxMinLengthFmt, rxPatternFmt,
  rxCollectionFormatFmt, rxEnumFmt, rxDefaultFmt, rxExampleFmt,
  rxMaxItemsFmt, rxMinItemsFmt, rxUniqueFmt, rxItemsPrefixFmt) and
  rxRequired
The internal helpers commentMatcher and commentSubMatcher fall with
their last users; commentBlankSubMatcher and
commentMultipleSubMatcher survive (still used by ModelOverride /
ResponseOverride / ParametersOverride).
Tests updated:
- TestHasAnnotation, TestIsAliasParam, TestCommentMatcher,
  TestCommentSubMatcher gone.
- TestSchemaValueExtractors trimmed to the rxModelOverride +
  rxParametersOverride cases that still have production callers;
  cartesianJoin / titleCaseVariants / verifyBoolean /
  verifyIntegerMinMaxManyWords / verifyMinMax / verifyNumeric2Words /
  verifyRegexpArgs / makeMinMax dropped along with the dead
  regex-matrix coverage they supported.
- TestExtractAnnotation, TestCommentBlankSubMatcher,
  TestCommentMultipleSubMatcher kept verbatim.
Diff: 4 files, −492 net.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
After the grammar migration left helpers/ with one symbol per use
case, the package's "grab-bag of bridge utilities" charter no
longer applied. Each survivor moves to the place that owns it now.
Deleted outright (zero callers):
- Setter (lines.go) — closure factory for SectionedParser
callbacks; nothing left to set.
- CleanupScannerLines (lines.go) — was scaffolding for
CollectScannerTitleDescription and
the old extensions parser.
- CollectScannerTitleDescription + its three regexes (title_desc.go)
— every consumer now reads
block.Title() / block.Description()
from grammar; the v1 heuristics
(blank-line split, ATX heading
promotion, punctuation-ends → title)
live in grammar's lexer as the
classifyProseRun heuristics.
Relocations:
- GetEnumBasicLitValue → internal/scanner/enum_value.go as
unexported enumBasicLitValue. Used by
exactly one site in scan_context.go; the
function coerces Go AST basic literals to
runtime values, which is scanner-layer
concern, not parsers-layer.
- SchemesList / SecurityRequirements → internal/builders/common/
routemeta.go. Both are shared by
builders/routes and builders/spec (and only
those two); common already hosts shared
build-layer utilities.
Inlined at the call site:
- JoinDropLast + DropEmpty had one combined caller in spec/walker.go
(the Terms-Of-Service body builder). DropEmpty + JoinDropLast is
equivalent to "join non-blank lines with \n" — landed as a small
local joinNonBlank in spec/walker.go.
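For illustration, a helper with that behaviour could look like the sketch below. The body is an assumption (the commit only names `joinNonBlank` and its intent), in particular the choice to treat whitespace-only lines as blank.

```go
package main

import (
	"fmt"
	"strings"
)

// joinNonBlank joins the non-blank lines with "\n".
// Assumption: "blank" means empty after trimming whitespace.
func joinNonBlank(lines []string) string {
	kept := make([]string, 0, len(lines))
	for _, line := range lines {
		if strings.TrimSpace(line) != "" {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	body := []string{"There are no TOS at this moment,", "", "use at your own risk.", "  "}
	fmt.Printf("%q\n", joinNonBlank(body))
}
```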
internal/parsers/helpers/ directory removed. CLAUDE.md package
layout refreshed.
Test suite green (16 packages, no golden drift).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
An audit of regexp imports across internal/ surfaced three small
matchers in the build path doing simple byte-class work that did not
justify a regex engine. All are ported to hand-rolled byte loops in
the same files:
- yaml/list.go:rxLineLeader (`^[\p{Zs}\t/\*]*\|?`) — strip leading
whitespace/tab/slash/asterisk run plus optional `|` from a YAML
list-body line before unmarshal. Replaced with stripListLeader:
ASCII-only byte loop, same intent.
- spec/walker.go:rxStripTitleComments
(`^[^\p{L}]*[Pp]ackage\p{Zs}+[^\p{Zs}]+\p{Zs}*`) — strip the
Go-doc-comment `Package <ident>` prefix off a meta title.
Replaced with stripPackagePrefix: detect leading non-letter run,
match Package/package, eat whitespace, eat identifier, eat
trailing whitespace. ASCII-only — every Go source title in the
fixtures starts with an ASCII letter and uses ASCII spaces.
- spec/walker.go:httpFTPScheme (`(?:(?:ht|f)tp|ws)s?://`) —
locate the URL prefix in a `Name <email> URL` contact line.
Replaced with a six-entry urlSchemes lookup table and a
find-leftmost-match scan. Same prefix set, no engine spin-up.
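The first two byte loops might look like the sketch below. The function names come from the list above, but the bodies are reconstructed from the quoted regexes and the step-by-step description, so they are an approximation rather than the committed code.

```go
package main

import (
	"fmt"
	"strings"
)

// stripListLeader drops a leading run of space, tab, '/' and '*' bytes,
// then one optional '|'. ASCII-only approximation of `^[\p{Zs}\t/\*]*\|?`.
func stripListLeader(line string) string {
	i := 0
	for i < len(line) && strings.IndexByte(" \t/*", line[i]) >= 0 {
		i++
	}
	if i < len(line) && line[i] == '|' {
		i++
	}
	return line[i:]
}

func isASCIILetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// stripPackagePrefix removes a Go-doc style "Package <ident> " prefix.
// It returns the input unchanged when the prefix does not match.
func stripPackagePrefix(title string) string {
	i := 0
	for i < len(title) && !isASCIILetter(title[i]) {
		i++ // leading non-letter run
	}
	rest := title[i:]
	if !strings.HasPrefix(rest, "Package") && !strings.HasPrefix(rest, "package") {
		return title
	}
	j := i + len("Package")
	start := j
	for j < len(title) && title[j] == ' ' {
		j++ // at least one space is required after the keyword
	}
	if j == start {
		return title
	}
	start = j
	for j < len(title) && title[j] != ' ' {
		j++ // the package identifier
	}
	if j == start {
		return title
	}
	for j < len(title) && title[j] == ' ' {
		j++ // trailing spaces
	}
	return title[j:]
}

func main() {
	fmt.Printf("%q\n", stripListLeader("  * | item"))
	fmt.Printf("%q\n", stripPackagePrefix("// Package petstore Petstore API."))
}
```

Both loops only ever advance an index over the input, so they allocate nothing beyond the returned substring.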
Remaining regexp imports under internal/ — all kept by design:
parsers/{regexprs,matchers,parsed_path_content}.go — scanner
classification fast path; restructuring is deferred per the
"ast bits of interest" work item.
scanner/index.go — user-facing package include/exclude patterns
take regex by API contract.
schema/walker.go — RE2 compile check on user-supplied `pattern:`
values; correctness requires the actual engine.
Test suite green; no golden drift.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
A verbatim 5-line buildOption helper was copy-pasted in
parameters/parameters.go and responses/responses.go. Both copies pick
between schema.WithType (for body schemas) and schema.WithSimpleSchema
(for non-body parameter / header sites) based on typable.In().
Hoist into the schema package where the Build modes already live:
    func OptionFor(tpe types.Type, tgt ifaces.SwaggerTypable) Option {
        if tgt.In() == "body" { return WithType(tpe, tgt) }
        return WithSimpleSchema(tpe, tgt, tgt.In())
    }
Both consumers already import schema, so swapping the local helper
for schema.OptionFor at every call site is mechanical. The local
buildOption definitions drop.
If a third Build mode ever lands, or the body/non-body gate gains
nuance (allowEmptyValue exception, etc.), the dispatch becomes a
one-place edit.
No behaviour change. Full suite green; no golden drift.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…esList
Grammar already had an unexported `splitCommaList` powering
Block.GetList for ShapeCommaList keywords. Export it as
grammar.SplitCommaList and reuse it at the two routes/spec call sites
that read a raw Property.Value (the dispatchers can't use
Block.GetList without restructuring the keyword switch).
common.SchemesList, a 10-line wrapper around the same algorithm,
retires. common/routemeta.go shrinks to just SecurityRequirements,
which has no equivalent on grammar's typed surface (the "name:
scope1, scope2 word truncates" line shape with the v1 quirk is
meta/route body specific). A short doc-block on routemeta.go now
explains why the file exists at all: single utility, two callers,
retire-or-hoist on the next data point.
No behaviour change. Full suite green; no golden drift.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
New fixture under fixtures/enhancements/parameters-map-postdecl/
exercises a long-standing bug in parameters.buildFromFieldMap:
the schema sub-builder's PostDeclarations are silently dropped
instead of being propagated to the parameters builder's chain.
Repro shape:
- LocalItem — NOT swagger:model annotated; only reachable via the
map field below.
- MapParams (swagger:parameters mapBody) — body field of type
map[string]LocalItem.
- swagger:operation POST /items mapBody — declares the operation
so the parameters declaration has somewhere to attach.
The buggy golden captured here shows the inconsistency clearly:
"schema": {
"type": "object",
"additionalProperties": { "$ref": "#/definitions/LocalItem" }
}
…but the spec has no "definitions" section at all. The $ref
points at a definition that does not exist. Downstream tooling
(generators, validators) would either fail loudly or silently
drop the property type.
The bug exists because every sibling builder method —
buildFromFieldStruct, buildFromFieldInterface, buildNamedField,
buildFieldAlias — propagates sb.PostDeclarations(). Only
buildFromFieldMap forgets the loop. Likely an oversight from when
the schema-builder factor-out landed (Stream M2.5).
The fix is a single for-loop. Landing it in the follow-up commit
regenerates this golden; the diff there will show LocalItem
appearing in definitions, confirming the fix against this
known-broken baseline.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…uild
The one-loop fix to the bug witnessed in the previous commit:
parameters.buildFromFieldMap now propagates the schema
sub-builder's PostDeclarations to the parent builder's chain,
matching what every sibling buildFromFieldXxx method already
does.
The witnessing golden delta:
+ "definitions": {
+ "LocalItem": {
+ "description": "LocalItem — NOT annotated; reachable only via the map field on\nMapParams below.",
+ "type": "object",
+ "properties": {
+ "name": { "type": "string", "x-go-name": "Name" },
+ "tag": { "type": "string", "x-go-name": "Tag" }
+ },
+ "x-go-package": "github.com/go-openapi/codescan/fixtures/enhancements/parameters-map-postdecl"
+ }
+ }
The earlier golden referenced #/definitions/LocalItem but had no
definitions block. Post-fix, the referent exists.
No other golden drift — verified with full go test ./...
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Grammar gains a sibling sub-parser for the `Security:` block body:
internal/parsers/security/, modelled on the yaml sub-parser that
already handles `extensions:`. The lex-time dispatch in
grammar.emitRawBlock calls security.Parse on the `security:` raw body
and stores the typed []security.Requirement on baseBlock. Block
exposes it via SecurityRequirements().
Builders consume it the same way they consume Extensions: read off
the block after the property loop, drop the per-keyword dispatch
case. routes/walker.go and spec/walker.go each shed the KwSecurity
case in dispatchRouteKeyword / dispatchMetaSimple.
common/routemeta.go retires entirely: SecurityRequirements was its
last function (SchemesList moved to grammar.SplitCommaList in
M6.4-C3). The common package shrinks back to just *common.Builder.
The v1 quirk (per-scope whitespace truncation at the first word) is
codified in security.Parse with a comment explaining the intent. No
behaviour change vs the prior common.SecurityRequirements
implementation; fixtures only exercise single-word scopes today.
Full suite green; no golden drift.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…Block
Two single-line value parsers move from spec/walker.go into
grammar — option 1 in our M6.4 discussion (trivial enough that a
sibling sub-parser is overkill; the parse logic lives in
grammar/meta_info.go).
New types and accessors on grammar.Block:
- type Contact struct { Name, Email, URL string }
- type License struct { Name, URL string }
- Block.Contact() (Contact, error) — error surfaces malformed
`Name <email>` heads
- Block.License() (License, bool) — bool is "found", no parse
failure path
The error return on Contact preserves the existing v1 contract:
TestMalformed_BadContact still aborts the build on a malformed
contact line. License has no parse failure mode (Name and URL
just split on the URL scheme), so the simpler (License, bool)
shape applies.
Both accessors are lazy — they iterate properties on call.
Single-line input, trivial cost; no lexer changes needed.
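A rough sketch of the two single-line parsers follows. It assumes the `Name <email> URL` line shape and the six-scheme prefix set mentioned in the earlier regex-removal commit, and it uses `net/mail` because that import is listed as migrating out of spec/walker.go. The function names `parseContact`, `parseLicense` and `splitURL` here are illustrative and their bodies are not the committed code.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

type Contact struct{ Name, Email, URL string }
type License struct{ Name, URL string }

// urlSchemes is the assumed six-entry prefix table.
var urlSchemes = []string{"http://", "https://", "ftp://", "ftps://", "ws://", "wss://"}

// splitURL cuts the line at the leftmost URL scheme, if any.
func splitURL(line string) (head, url string) {
	at := -1
	for _, scheme := range urlSchemes {
		if i := strings.Index(line, scheme); i >= 0 && (at < 0 || i < at) {
			at = i
		}
	}
	if at < 0 {
		return strings.TrimSpace(line), ""
	}
	return strings.TrimSpace(line[:at]), strings.TrimSpace(line[at:])
}

// parseContact reports an error for a malformed `Name <email>` head.
func parseContact(line string) (Contact, error) {
	head, url := splitURL(line)
	if head == "" {
		return Contact{URL: url}, nil
	}
	addr, err := mail.ParseAddress(head)
	if err != nil {
		return Contact{}, fmt.Errorf("contact: %w", err)
	}
	return Contact{Name: addr.Name, Email: addr.Address, URL: url}, nil
}

// parseLicense has no failure mode: name and URL just split on the scheme.
func parseLicense(line string) License {
	name, url := splitURL(line)
	return License{Name: name, URL: url}
}

func main() {
	c, err := parseContact("Jane Doe <jane@example.com> https://example.com")
	fmt.Printf("%+v %v\n", c, err)
	fmt.Printf("%+v\n", parseLicense("MIT https://opensource.org/licenses/MIT"))
}
```

The asymmetry in return shapes mirrors the reasoning above: only the contact head can be syntactically wrong, so only the contact parser needs an error path.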
Migrated out of spec/walker.go:
- parseContactInfo, parseLicense, splitURL, urlSchemes (~60 lines)
- net/mail import
spec/walker.go reads the typed accessors after the property loop,
same shape as block.SecurityRequirements() from the previous
commit.
Full suite green; no golden drift; TestMalformed_BadContact still
fails as expected on malformed input.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
… refactor
Captures the legacy swagger:route body sub-language behaviour
(Parameters: and Responses: blocks parsed by
builders/routes/body_params.go and builders/routes/body_responses.go)
across 28 focused fixtures with golden JSON.
Existing route coverage was petstore-shaped
(Consumes/Produces/Schemes/Responses/Security/Extensions) with ZERO
integration goldens exercising the `+ name:` chunk grammar. When the
legacy parsers retire in M6.5-C, the in-package unit-test coverage
retires with them. This commit closes that gap.
These goldens are WITNESS goldens for the M6.5 refactor: they lock the
current buggy output so M6.5-B/C diffs surface each fix landing as the
rewrite naturally delivers checkShape diagnostics (M6.5-B) and clean
routebody design (M6.5-C). Quirks logged as Q14-Q22 in
.claude/plans/observed-quirks.md.
Coverage:
- Parameters block: path/query/header/form/body in: values;
  string/integer/number/boolean/array types;
  min/max/minlength/maxlength/format/default/enum/required/allowempty
  validations; body refs with [] and [][] array nesting; chained
  multi-param; unknown-key silent drop; empty + chunk; schema
  overrides on body refs.
- Responses block: positional (untagged), tagged
  body:/response:/description:; mixed body+response refs;
  description-only; default; array nesting; empty value;
  definition-fallback (found); ref-not-found (dangling); multi-codes.
- Combined: full petstore-shape route; multi-method same-path.
- Quirk fixtures: space-body misinterpretation, ref-not-found dangling.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…lers package
Pure code motion. The level-0 and items-level dispatchers used by the
parameters and responses builders move into the handlers package
alongside the per-shape callback factories they orchestrate. Tests
and integration goldens unchanged: byte-identical output, verified
against the M6.5-PRE witness suite.
Moved into the new handlers/dispatch_simple.go:
- DispatchParamLevel0 (from parameters/walker.go:dispatchParamLevel0)
- DispatchHeaderLevel0 (from responses/walker.go:dispatchHeaderLevel0)
- DispatchItemsLevel: collapses parameters' walkItemsLevel and
  responses' walkHeaderItemsLevel; their bodies were byte-identical
  modulo the diagnostic source.
- paramValidations adapter (from parameters/typable.go)
- headerValidations adapter (from responses/typable.go)
- paramRequiredBool helper (private to handlers)
The parameters/responses walkers now hold only AST-side orchestration
(applyBlockToField, collectXxxItemsLevels) and delegate the
grammar-side dispatch via handlers.DispatchParamLevel0 /
handlers.DispatchHeaderLevel0 / handlers.DispatchItemsLevel.
paramTypable and responseTypable stay in their packages: they
implement SwaggerTypable for upstream wiring, distinct from the
SimpleSchema validation adapters that moved.
This unblocks M6.5-C: routebody-emitted grammar.Block instances will
dispatch through the same handlers seam parameters/responses use
today, eliminating the duplication of validation logic that
builders/routes/body_params.go currently carries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…ackage
Pure code motion. The full-Schema dispatch quartet (level-0 and items-
level dispatchers, plus the per-shape Number/Integer/Bool/String/Raw
handler factories) moves out of schema/walker.go into the handlers
package, alongside the SimpleSchema dispatchers already there from
M6.5-A. Tests and integration goldens unchanged: byte-identical output
verified against the M6.5-PRE witness suite.
Moved into the new handlers/dispatch_schema.go:
- DispatchSchemaLevel0 (was schema/walker.go:walkSchemaLevel)
- DispatchSchemaItemsLevel (was schema/walker.go:walkItemsLevel)
- SchemaValidations adapter + NewSchemaValidations constructor
(was schema/typable.go:schemaValidations)
- ApplyPattern (was schema/walker.go:applyPattern; exported because
schema's refOverrideCollector still uses it)
- SetRequired, SetDiscriminator (was schema/walker.go:setRequired/
setDiscriminator; SetRequired exported for refOverrideCollector)
- SchemaTypeOf (was schema/walker.go:schemaTypeOf; exported for
refOverrideCollector)
- clearStaleEnumDesc (was schema/extensions.go; unexported, called
from SchemaValidations.SetEnum)
- schemaNumberHandler, schemaIntegerHandler, schemaBoolHandler,
schemaStringHandler, schemaRawHandler (per-shape factory closures,
unexported)
- checkShape (unexported gate)
SchemaOptions{SimpleSchemaMode bool} replaces the Builder.simpleSchema
flag at the dispatcher boundary. The schema package's Builder exposes
a one-line schemaOpts() helper that packages its mode flag into the
struct on each call. The SimpleSchema-mode gate inside schemaBoolHandler
behaves identically — it emits CodeUnsupportedInSimpleSchema for
full-Schema-only keywords and silently skips required:.
The schema package keeps refOverrideCollector (field-on-struct
$ref-override allOf rewrite — not relevant to routes or other
consumers), applyBlockToDecl / applyBlockToField (AST-side entry
points), applyToRefField (collector driver), and flattenItemsTargets
(AST array-layer walk). The extensions.go file went away entirely
when clearStaleEnumDesc moved.
This unblocks M6.5-C: routebody-emitted grammar.Block instances for
body params/responses dispatch through handlers.DispatchSchemaLevel0
with no enclosing (nil) and no name (""), inheriting checkShape
gating + ApplyPattern's RE2 hygiene check + ParseDefault coercion —
the same validation engine struct fields go through today, without
re-implementing any of it in builders/routes/body_params.go.
This commit also delivers Q15 + Q16 fix-readiness: the checkShape
gate is now uniformly available on the routes dispatch path that
M6.5-C will wire up. (The golden delta for Q15/Q16 lands when M6.5-C
ships, not here, because routes still uses its legacy body_params.go
parser at this point.)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Replaces the legacy regex+state-machine parsers in
builders/routes/body_params.go and body_responses.go (479 LOC retired)
with a hand-rolled tokenizer in the new internal/parsers/routebody
package + per-decl dispatch through handlers.DispatchParamLevel0
(SimpleSchema) and handlers.DispatchSchemaLevel0 (body schema). The
synthetic grammar.Block constructor (NewSyntheticBlock) lets routebody
emit standard Blocks the dispatchers consume without bespoke wiring.
# Architecture
- internal/parsers/routebody/{doc,parameters,responses,diag}.go: pure
tokenizer over the legacy `+ name:` chunk grammar (parameters) and
the positional `<code>: <tokens>` line grammar (responses). Returns
typed ParamDecl / ResponseDecl values; ParamDecl carries the
validation properties as a synthetic Block.
- internal/parsers/grammar/synthetic.go: NewSyntheticBlock factory.
- internal/builders/routes/walker.go: rewritten dispatchParameters /
dispatchResponses orchestrate routebody output. Body params type-
gate via validations.IsLegalForType (Q15/Q16-equivalent on routes
SimpleSchema); body schemas via DispatchSchemaLevel0's checkShape.
# Fixes (Q14-Q22 per .claude/plans/observed-quirks.md)
The PRE goldens shift to witness each fix landing as planned:
- Q14: body param description no longer duplicates onto param.Schema.
- Q15/Q16: min/max/format on body refs now uniformly gate via
checkShape; SimpleSchema params type-gate via typeGateBlock.
- Q17: bare `+` chunks now drop (with diagnostic) rather than emit
empty parameter objects.
- Q18: unknown parameter keywords emit CodeInvalidAnnotation rather
than silent-drop.
- Q19: empty `204:` value handled cleanly.
- Q20: `body Foo` space-separated form detected as typo; drop with
diagnostic rather than mis-parse as response="body".
- Q21: leading-space artifact on tail-of-line descriptions removed
(routebody strips leading whitespace from the accumulator).
- Q22: dangling response refs (name in neither responses nor
definitions) drop with diagnostic rather than emit invalid $ref.
# Accept `- ` as chunk-start alias
Routebody accepts `- ` alongside the canonical `+ ` chunk-start sigil,
preparing for future YAML-style authoring (v2). No further YAML
semantics are implied at this point.
# Behaviour changes beyond the planned Qs
- `pattern:` on inline route parameters now lands on param.Pattern
(the legacy parser silently dropped pattern: from inline params —
it wasn't in the applyParamField switch).
- minLength/maxLength on array params now drop (with diagnostic) per
OAS v2: those are string-typed constraints. Use minItems/maxItems
for arrays. The legacy parser mis-applied them.
- Malformed-input handling: duplicate body/response tags and unknown
tags now emit CodeInvalidAnnotation and drop the response, rather
than failing the entire build with a hard error. The
TestMalformed_DuplicateBodyTag / BadResponseTag tests were updated
to check for a successful Run plus captured goldens.
- TestRoutesParser / TestRoutesParserBody now seed the responses map
with the names the classification fixture references, since Q22
requires resolvable refs. Some hardcoded validation-pointer
assertions updated to nil for type-gated array params.
# Net code change
- Routes: -479 LOC (body_params.go, body_params_test.go,
body_responses.go, body_responses_test.go, setters.go all retired).
- Routebody: +~370 LOC (4 files; clean tokenizer + diagnostics).
- Routes walker: +~200 LOC of orchestration replacing the deleted
dispatchRouteKeyword body branches.
- Grammar synthetic factory: +30 LOC.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…r strip
`builders/routes/walker.go:trimCommentPrefix` was a routes-side
workaround for a coupling between `parsers.parsePathAnnotation` and
the grammar lexer's `stripComment`: parsePathAnnotation reshapes a
multi-line `/* ... */` block-comment route doc into per-line synthetic
`*ast.Comment` entries. After the swagger:route header line is
stripped, the per-line entries have NO `//` or `/*` prefix, so
`stripComment`'s `default:` branch fires — it returns the line
verbatim without running `trimContentPrefix`. Leading tabs / spaces
then survive into Title / Description; routes papered over it.
Now: parsePathAnnotation prepends `// ` to each synthetic per-line
comment (via a tiny `ensureCommentMarker` helper). Grammar's lexer
takes the `//` branch and runs `trimContentPrefix` (strips ` \t*/|`),
shedding leading whitespace exactly as it does for `//`-comment
routes. `trimCommentPrefix` and its godoc retire — the comment was
inaccurate anyway (it blamed grammar's lexer for the leak when the
actual cause was parsers' synthetic ast.Comment construction).
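The interaction can be illustrated with the simplified model below. The three function names are taken from the text, but every body is an assumption: in particular, whether `ensureCommentMarker` skips lines that already carry a marker is a guess based on its name, and the `/* ... */` handling in `stripComment` is not described in this message at all.

```go
package main

import (
	"fmt"
	"strings"
)

// trimContentPrefix strips the leading ` \t*/|` run described above.
func trimContentPrefix(s string) string {
	return strings.TrimLeft(s, " \t*/|")
}

// stripComment is a simplified model of the lexer helper: only lines that
// carry a comment marker get their content prefix trimmed.
func stripComment(text string) string {
	switch {
	case strings.HasPrefix(text, "//"):
		return trimContentPrefix(text[2:])
	case strings.HasPrefix(text, "/*"):
		return trimContentPrefix(strings.TrimSuffix(text[2:], "*/"))
	default:
		return text // no marker: returned verbatim, leading whitespace leaks
	}
}

// ensureCommentMarker gives a synthetic per-line comment a `// ` prefix so
// the lexer takes its `//` branch. Assumed to leave marked lines alone.
func ensureCommentMarker(line string) string {
	if strings.HasPrefix(line, "//") || strings.HasPrefix(line, "/*") {
		return line
	}
	return "// " + line
}

func main() {
	synthetic := "\t  List all pets"
	fmt.Printf("%q\n", stripComment(synthetic))                      // whitespace survives
	fmt.Printf("%q\n", stripComment(ensureCommentMarker(synthetic))) // whitespace shed
}
```

Because `-` is not in the trimmed character set, a line such as `  - first item` keeps its dash after the marker is added.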
Two intentional behaviour shifts captured as Q23 in
.claude/plans/observed-quirks.md:
- Markdown dash lists in route descriptions now survive. The
legacy trimCommentPrefix stripped leading `-` too; the new
trimContentPrefix path preserves them. Fixture:
routes-description-dash-list.
- `---` in route descriptions now triggers the YAML-fence absorber
consistent with every other annotation. Subsequent prose
(and any keyword blocks after it) is captured as YAML body and
drops from the visible spec. The legacy stripped `---` to empty,
masking this. Routes don't use YAML blocks, so this is a sharp
edge — pathological authors only. Fixture:
routes-description-yaml-fence-absorb.
Existing route goldens unchanged (no fixture in the suite carried a
`-` or `---` at line start in its description); the two new fixtures
are pure witnesses for the new contract.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…AsList
Replaces three ad-hoc list helpers (grammar.SplitCommaList,
yaml.ListBody, routes.bodyLines) with a single flex-list accessor
on grammar.Property. Block.GetList delegates to it. Every consumer
of list-shaped keywords (schemes / consumes / produces on routes,
spec/meta) now reads through the same seam, with the union of
accepted surface forms:
    Schemes: http, https     # inline, comma-separated
    Schemes:                 # multi-line, indented bare lines
      http
      https
    Schemes:                 # multi-line, YAML `- ` markers
      - http
      - https
    Schemes: http            # inline + indented continuation
      - https
# Three latent issues fixed along the way
- Inline raw-block values were silently lost. `Consumes:
application/json` on a single line produced an empty Body
because collectRawBlock skipped head.Text entirely (only
collectRawValue had a single-line capture path). Prepending
head.Text to bodyText restores parity.
- Multi-line `Schemes:` was silently lost. KwSchemes was
registered as asCommaList(), which the lexer doesn't expand
into a body. Now asRawBlock(); combined with the inline-value
capture above, every form works.
- Different code paths per keyword maintained subtle inconsistencies
(routes accepted comma for Schemes but not multi-line; meta
silently lost inline Consumes; etc.). Eliminated by funnelling
everything through Property.AsList.
# Algorithm (Property.AsList)
For each input line — Value (if non-empty) + each line of Body:
trim whitespace; drop a leading `- ` YAML marker if present; re-trim;
comma-split; trim each token; drop empties. Aggregate.
# Stops at simple token lists
Does NOT touch:
- enum values (whose elements may be complex / JSON arrays);
- the `+ name:` Parameters chunk grammar (routebody-owned);
- YAML structural bodies (securityDefinitions, extensions,
infoExtensions — those parse through yaml.TypedExtensions /
json.Unmarshal because their structure isn't a simple list).
# Retired helpers
- parsers/yaml/list.go (yaml.ListBody + stripListLeader) — file
deleted.
- grammar.SplitCommaList — deleted.
- routes.bodyLines — deleted from routes/walker.go (still lives
in spec/walker.go for KwTOS, which is a prose-join, not a
list).
# Witness fixtures
- routes-lists-flex-forms: four routes, each using a different
surface form. All resolve to the same token-list shape.
- meta-lists-flex-forms: swagger:meta with mixed forms for
schemes/consumes/produces.
# Backward compat
Every existing route fixture's goldens unchanged — they all used
forms that already worked under one path or another. The two
witness fixtures capture the now-unified contract.
Logged as Q24 in .claude/plans/observed-quirks.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Pre-M6.5-E meta diverged from every other annotation on extension
handling:
- Routes / operations: consume block.Extensions() (typed surface
from grammar). Non-x-* keys silently dropped at parse time, no
diagnostic.
- Meta: dispatch KwExtensions / KwInfoExtensions through
yamlparser.TypedExtensions(p.Body) + validateExtensionNames,
hard-erroring out of codescan.Run on any non-x-* key.
That meant double YAML parsing (grammar + meta), an inconsistent
error contract across annotation families, and no diagnostic
signalling for typos in routes / operations.
Unified contract: grammar emits CodeInvalidAnnotation in
collectExtensionsFromBody when isExtensionName rejects a key, then
drops it. Extension.Source now carries the keyword name
("extensions" / "infoExtensions") so meta can route entries to
swspec.Extensions vs swspec.Info.Extensions; consumers that don't
care (routes, operations) just ignore it.
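A rough sketch of the diagnose-and-drop contract follows. Everything here is assumed for illustration: the `Extension` and `Diagnostic` struct layouts, the string value of `CodeInvalidAnnotation`, the `entry` input type, and the case-insensitive `x-` prefix test in `isExtensionName` are not taken from the grammar package.

```go
package main

import (
	"fmt"
	"strings"
)

// CodeInvalidAnnotation mirrors the diagnostic code named above; the string
// value is a placeholder chosen for this sketch.
const CodeInvalidAnnotation = "invalid-annotation"

// Extension carries one accepted entry plus the keyword it came from.
type Extension struct {
	Name   string
	Value  any
	Source string // "extensions" or "infoExtensions"
}

// Diagnostic is a minimal stand-in for the grammar's diagnostic type.
type Diagnostic struct {
	Code    string
	Message string
}

// entry is a hypothetical ordered key/value pair from the parsed body.
type entry struct {
	Key   string
	Value any
}

// isExtensionName assumes a case-insensitive "x-" prefix check.
func isExtensionName(name string) bool {
	return strings.HasPrefix(strings.ToLower(name), "x-")
}

// collectExtensions keeps valid keys and reports, then drops, the others.
func collectExtensions(source string, entries []entry) ([]Extension, []Diagnostic) {
	var exts []Extension
	var diags []Diagnostic
	for _, e := range entries {
		if !isExtensionName(e.Key) {
			diags = append(diags, Diagnostic{
				Code:    CodeInvalidAnnotation,
				Message: fmt.Sprintf("%s: key %q is not an x-* extension; dropped", source, e.Key),
			})
			continue
		}
		exts = append(exts, Extension{Name: e.Key, Value: e.Value, Source: source})
	}
	return exts, diags
}

func main() {
	exts, diags := collectExtensions("infoExtensions", []entry{
		{Key: "x-logo", Value: "logo.png"},
		{Key: "logo", Value: "typo"},
	})
	fmt.Println(len(exts), exts[0].Name, exts[0].Source)
	fmt.Println(len(diags), diags[0].Code)
}
```

A meta consumer could then switch on `Source` to route each entry to the spec-level or info-level extension map, while routes and operations would ignore the field.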
Retired in builders/spec/walker.go:
- validateExtensionNames
- ErrBadExtensionName
- the two yamlparser.TypedExtensions calls in dispatchMetaYAMLBlock
dispatchMetaYAMLBlock now handles only KwSecurityDefinitions (the
one structural-YAML body left); extensions / infoExtensions flow
through a single block.Extensions() loop in applyMetaBlock.
stripPackagePrefix simplified in the same commit: replaced the
~35-LOC byte-loop with strings.CutPrefix + TrimSpace. Same
semantics: skip leading whitespace, match `Package `, drop the
identifier, return the tail. Rejects lowercase `package` so prose
like "package this carefully" doesn't get silently chopped (the
legacy accepted both; only the capital-P godoc convention is
honoured here).
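The simplified shape might look like the sketch below. It is an assumption-laden illustration: the actual signature and the return value for non-matching input are not shown in the commit, so this version returns unmatched text unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPackagePrefix is a hypothetical rendering of the simplified helper:
// skip surrounding whitespace, match the capital-P "Package " prefix, drop
// the package identifier, and return the remaining tail.
func stripPackagePrefix(text string) string {
	rest, ok := strings.CutPrefix(strings.TrimSpace(text), "Package ")
	if !ok {
		// Assumption: input without the prefix (including lowercase
		// "package ...") is returned untouched.
		return text
	}
	rest = strings.TrimSpace(rest)
	idx := strings.IndexAny(rest, " \t\r\n")
	if idx < 0 {
		return "" // only the identifier was present
	}
	return strings.TrimSpace(rest[idx:])
}

func main() {
	fmt.Printf("%q\n", stripPackagePrefix("Package petstore provides the pet API."))
	fmt.Printf("%q\n", stripPackagePrefix("package this carefully"))
}
```

Note that `strings.CutPrefix` requires Go 1.20 or later. The first call yields `"provides the pet API."`; the second is returned as-is because the lowercase form is rejected.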
TestMalformed_MetaBadExtensionKey and
TestMalformed_InfoBadExtensionKey switched from require.Error to
golden-comparison witnesses — they now expect Run to succeed with
the bad key absent from the emitted spec, matching the
diagnose-and-drop posture M6.5-C established for routes.
Logged as Q25 in .claude/plans/observed-quirks.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Caught during the M6.5 observed-quirks audit pass. The M6.5-C entry in
observed-quirks.md claimed Q21 was resolved (routebody's description
accumulator would strip leading whitespace), but the re-captured
goldens still showed `" OK"`, `" not found"`, etc. — the legacy
leading-space artifact was never actually stripped, and the M6.5-PRE
re-capture had locked the still-buggy state under the false assumption
it would land via routebody.
Cause: routebody/responses.go's `description:`-tag branch appended the
(empty) post-colon val from a token like `description:` before the
tail tokens, so `strings.Join(["", "OK"], " ")` produced `" OK"`. The
legacy parseTags had the same bug; routebody inherited it.
Fix: skip the empty val before appending to descTokens. The tail
tokens join with single spaces between non-empty entries; no leading
space.
Goldens shift for every response fixture using the `description: X`
form — primarily the route fixtures plus a handful of pre-existing
classification / petstore goldens that carry the same shape on inline
`description:` lines.
Audit-trail lesson: the witness-then-fix pattern relies on goldens
actually showing the post-fix shape. When the fix doesn't land but the
re-capture happens anyway, the harness silently locks a half-state.
The refresh audit is precisely where that drift surfaces.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Applies the M6.6 sweep recipe to builders/common: strip historical
narration, lift dense prose into a new package README, apply the M0
`# Details` pattern to the godoc that remains.
Specifically:
- Dangling `Implementation notes (temporary)` block at the file
tail (blockCache rationale + the
.claude/plans/grammar/p7.1-... pointer) lifted to
common/README.md §blockcache. Replaced inline with a
`# Details` reference on `ParseBlocks`.
- `MakeRef` godoc's "Same shape as ... used to have" historical
justification rewritten as a present-tense statement of why
the operation lives on the common base (§makeref).
- `Diagnostics` parenthetical `(per plan §11 Q2 - "watch and
improve when we start doing some serious work with
diagnostics")` lifted to §diagnostics as the LSP-evolution
caveat. The "Experimental" disclaimer folded into the same
paragraph.
- Two TODO(fred) markers (slog logger configurability; ireturn
lint posture on ParseBlock) moved to §quirks-open as deferred
follow-ups. The nolint directive itself retained — only the
inline TODO archaeology moved.
- Stray `// repo` comment removed.
- Package godoc added — points readers at the README from the
top of the file.
Code behaviour unchanged; only godoc + README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
The godoc on GetEnumDesc pointed at
`schema/extensions.go:clearStaleEnumDesc` but the function actually
lives in `handlers/dispatch_schema.go`. Update the reference so the
breadcrumb leads where the code does.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M0 documentation recipe to internal/scanner/ production
files: strip historical narration, drop deprecated grammar-evolution
references, and lift dense maintainer prose into a new
internal/scanner/README.md with anchored sections.
What moved where:
- options.go: stripped "Matches v1's strict default" historical
narration on DescWithRef; the bare-$ref shape is now described as
the current default with no time reference. Added a struct-level
godoc on Options that points at §options / §descwithref /
§diagnostics.
- scan_context.go: rephrased FileSet/OnDiagnostic/GetModel/
FindModel/AddDiscoveredModel godocs to the M0 short-intent-plus-
README-anchor shape. "previously-registered" on GetModel rewritten
as a current-state description ("discovered decls already present
in ExtraModels") and lifted into §model-lookup.
- index.go: detectNodes godoc condensed to one line plus a
§classifier README anchor; the bitmask/exclusivity rules now live
in the README.
- New README.md sections: §options, §descwithref, §diagnostics,
§model-lookup, §classifier, §quirks-open — mirrors the M0 shape
established by internal/builders/{common,schema}/README.md.
Code logic is unchanged — diff is godoc, comments, and the new
README only. Tests and golangci-lint --new-from-rev=HEAD pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment-summarisation sweep to
internal/builders/handlers:
- Strip "v1 accepts any string", "v1 never propagated default/example
  parse errors", "responses' parity — v1 ..." narration from
  CollectionFormatString, Raw, and DispatchHeaderLevel0; rewrite as
  present-tense contract statements.
- Collapse the long package godoc into an intent-focused sentence plus
  a "# Details" pointer to the new README.
- Fix a stale code-pointer in keywords.go (parameters/walker.go →
  dispatch_simple.go) and replace the inline carve-out paragraph with
  a README anchor.
- Lift dense rationale (Raw errSink contract, simpleSchema allow-list
  invariants, stale x-go-enum-desc cleanup, SimpleSchema vs
  full-Schema dispatch split, vendor-extension routing,
  collectionFormat fallback) into a new
  internal/builders/handlers/README.md following the M0 lift shape.
Code logic unchanged — only godoc, package comments, and a new README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment-summarisation sweep to
internal/builders/operations:
- Strip the "v1 only ever consumed one fenced body per operation; we
  preserve that" narration from applyBlockToOperation; the
  single-fenced-body contract stands on its own as a present-tense
  statement.
- Collapse the dense applyBlockToOperation godoc to a one-sentence
  intent plus a "# Details" pointer to the new README §walker.
- Trim the Builder/setPathOperation godoc to intent-focused statements
  and route maintainers to README anchors for the slot-reuse and
  Build-orchestration prose.
- Lift the dense rationale into a new
  internal/builders/operations/README.md following the M0 lift shape,
  with sections for the builder, the path-item-slot reuse, the walker
  contract, and the open single-fenced-body quirk.
Code logic unchanged — only godoc, package comments, and a new README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Strip historical-narration tics from schema-package godoc that
referenced "v1 behaviour", "parity with v1", "legacy" helpers, and an
internal plan section pointer. Comments now describe current-state
behaviour or lift the rationale (extension SkipExtensions scope,
items-chain ownership on named/alias arrays, $ref allOf compound
shape).
Logic-neutral — diff covers only doc-comment lines; tests pass and
golangci-lint reports zero issues.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Strip historical narration and internal plan pointers from the
validations package godoc and consolidate the rationale into a new
maintainer-notes README following the M0 §section pattern. Code logic
is unchanged — only comments, godoc, and the README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Apply the M6.6 comment summarisation recipe to internal/parsers/yaml:
strip historical narration (round-1 / round-2, "legacy" qualifiers,
"grammar v1 does NOT import this package"), collapse the two
`.claude/plans/typed-extensions.md` references, and lift the
typed-extensions rationale (YAML → JSON normalisation, dedent
strategies, sibling-sub-parser seam) into a new
`internal/parsers/yaml/README.md` following the canonical M0
`# Details` template with TOC anchors.
Also fix a stale `Parse` godoc reference to a `pos` parameter that the
current signature does not carry.
Code logic unchanged. `go test ./...` green;
`golangci-lint run --new-from-rev=HEAD ./internal/parsers/yaml/...`
zero issues.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Sweep `internal/builders/routes/` godoc and the package README to
strip Q-number references, "legacy parser" / "matches v1 behaviour"
narration, and historical migration commentary. Rewrite each affected
comment as a present-tense statement of the current contract; refactor
the README to drop references to files that no longer exist in the
package and to land the M0 long-form pattern (table of contents, table
of files, per-section anchors).
No code logic changes — godoc and Markdown only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…rchaeology
Move the long-form Parameters: / Responses: sub-language grammar from
doc.go into a maintainer README, leaving doc.go with a current-state
contract pointing at the README. Drop migration narration (Q-numbers,
"legacy parser" callouts, body_params.go line refs) from every file in
the package; rewrite the affected sentences as present-tense behaviour
statements.
Code logic is unchanged; this is a comment-only sweep.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…lan refs
Adds internal/parsers/grammar/README.md as the long-form companion to
the grammar package source; strips historical narration, plan-file
references (`.claude/plans/grammar/*.md`), and v1/round-1/grammar2
framing from godoc throughout the package. Code logic is unchanged —
every diff sits inside `//` comments.
README sections created:
- §overview / §pipeline — what the package parses, three-stage flow
- §preprocess-contract — comment-marker stripping rules
- §lexer-contract — line classifier, body accumulator, prose
  classifier; swagger-prefix and godoc-route exception; directives;
  items-prefix; trailing-dot elision; first-character
  case-insensitivity
- §prose-classification — TITLE/DESC split heuristics (lifted from the
  dense classifyProse / classifyProseRunsInPlace /
  classifyTitleDescRun godoc)
- §raw-block-terminators — the sibling-terminator rule, inline-value
  capture, per-body indentation handling (lifted from collectRawBlock
  / isSiblingTerminatorFor)
- §yaml-fence-handling — opaque YAML bodies and decorative fences
- §disambiguation — default / enum-args / type-ref / HTTP-method
  value-shape dispatch
- §parser-contract / §block-shapes — family dispatch table, typed
  Block kinds and their fields (lifted from Block interface godoc,
  ParseAll godoc, parseState godoc)
- §property-shape — Property / TypedValue / IsTyped / AsList (lifted
  from Property godoc and AsList's surface-form table)
- §walker-contract — full Walker dispatch table, FilterDepth rules,
  concurrency contract (lifted from Walker godoc — a 60-line block
  collapsed to one sentence + Details pointer)
- §keyword-table / §context-legality — closed-vocab keyword
  classification and per-annotation context legality
- §annotation-args — per-annotation argument terminals and validation
- §typed-extensions / §security-requirements / §contact-license —
  body→typed accessors
- §diagnostics — Code / Severity model and the full code list
- §synthetic-block — NewSyntheticBlock factory
- §quirks-open — deferred follow-ups (body-shape, position fidelity,
  closed-vocab prefix)
Files touched: doc.go, ast.go, lexer.go, parser.go, walker.go,
keywords.go, diagnostic.go, annotations.go, disambiguate.go,
preprocess.go, token.go, synthetic.go.
All tests green; golangci-lint --new-from-rev=HEAD reports zero new
issues.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Replaces the stale auto-generated docs/annotation-keywords.md with
four hand-authored topic-split docs under ./docs/. Each carries
minimal Hugo frontmatter (title + weight); author-leaning content
shows first (lower weights), implementer content later.
- annotations.md (weight 10) — author-first cheatsheet of every
swagger:* annotation with placement rules, argument shapes,
code samples, and a final annotation × keyword compatibility
matrix.
- keywords.md (weight 20) — keyword reference card grouped by
family (numeric / length / format / schema decorators /
parameter location / meta single-line / body). Per-keyword
contracts with samples drawn from real fixtures.
- sub-languages.md (weight 30) — the embedded languages: prose
classification, flex-list reader, Parameters / Responses body
grammars, YAML extensions, security requirements, contact /
license inline forms.
- grammar.md (weight 40) — formal EBNF lifted from the internal
plan and actualised against current code. Lexer terminal
vocabulary, top-level dispatch, per-family productions
(schema / operation / meta / classifier), cross-cutting
productions, walker contract, diagnostics model. Closing
section on what the grammar deliberately does NOT describe
(coercion, $ref resolution, etc. — analyzer territory).
Each doc cross-links to the others with relative paths. Code
samples are hybrid: inline snippet (so the doc reads end-to-end)
plus a one-line fixture path for the full executable example.
Per the M6.6 "one grammar only" rule, no traces of "grammar v1",
"grammar2", "v2 grammar", "the new grammar parser", "post-Stream-M",
or "round-1/round-2 typed-extensions" appear in the docs — the
published surface presents a single grammar that always existed in
its current form. The branch's multi-version refactor history stays
in .claude/plans/ (gitignored) and git log; the docs read
forward-only.
The old generator is not reintroduced. docs/annotation-keywords.md,
which carried a STALE banner pointing at the deleted generator code,
is removed.
The doc is v0 by intent. Some users will benefit from expanded
keyword samples and more worked end-to-end examples in a future
pass. The current shape is "ideal for maintainers, okay for users."
The eventual Hugo-published site will pick up the frontmatter
weights to order the navigation. A future doc generator (see the
testify codegen reference in the post-merge backlog) could augment
the hand-authored content; that's a separate project.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>