harden: float-syntax refusal, strict unknown-field reject, CSV/syslog escaping, uniform hex normalization #10
Merged
…/syslog escaping, uniform hex-anchor normalization
Second adversarial bug-hunt pass. Each item was reproduced with an executable PoC before fixing; the full test suite, clippy -D warnings, fmt, demo-seeder (seed 42), wasm-pack build and the Node integration all pass, and the demo chain root is unchanged.
canonical: refuse every float-syntax JSON number, including integer-valued floats such as 2.0 and -0. serde_json's decimal->f64 rounding diverges from a JavaScript JSON.parse near 2^53 (the token 9007199254740991.0 parses to 9007199254740990.0 under serde_json but to 9007199254740991 under V8), and the original token is gone by the time canonicalization runs, so the value a JS verifier would compute cannot be reproduced. Integer-syntax numbers up to 2^53-1 are unaffected. Removes the integer_valued_float cross-language vector.
verify_demo (strict): reject any record key that is not a recognized field or alias, so a flipped, renamed, or injected field name can no longer ride along on an otherwise-valid record. The lenient profile keeps tolerating unknown fields for heterogeneous producers.
verify_demo / verify / lib: normalize every caller-supplied hex anchor (pinned pubkey and expected root) identically across surfaces (trim, strip an optional 0x/0X prefix, lowercase) so the CLI, the wasm facade and a direct core caller accept the same inputs. A root that normalizes to empty is treated as "no anchor", matching the facade.
export (CSV): prefix any cell beginning with =, +, -, @, tab or CR with a single quote to neutralize spreadsheet formula injection from attacker-controlled event_type/source.
export (syslog): escape U+0085, U+2028 and U+2029 in the quoted MSG body so a Unicode-newline-aware SIEM cannot be tricked into splitting a record.
Summary
Second adversarial bug-hunt of the public verifier. The core cryptographic
contract held up: no reachable strict false-accept, no panic on hostile
input, canonical form idempotent and injective on broad random sampling.
This PR fixes the low-severity hardening items that the hunt surfaced. Each
was reproduced with an executable proof-of-concept before being fixed.
Fixes
canonical: refuse float-syntax numbers (cross-language parity).
serde_json's decimal-to-f64 rounding diverges from a JavaScript JSON.parse
for inputs near 2^53: the token 9007199254740991.0 parses to
9007199254740990.0 under serde_json but to 9007199254740991 under V8.
The original token is gone by canonicalization time, so the value a JS
verifier would compute cannot be reproduced, and emitting the mis-rounded
value would silently break the byte-for-byte cross-language contract.
Float-syntax numbers (including integer-valued floats like 2.0 and -0)
are now refused; integer-syntax numbers up to 2^53-1 are unaffected.
Producers already encode amounts as strings, so the demo is unchanged.
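A minimal sketch of the refusal rule, not the PR's actual code. ParsedNumber is a hypothetical stand-in for the three-way split (unsigned, signed, float) that a Rust JSON parser typically exposes after parsing; check_number and NumberRefusal are illustrative names as well. The symmetric lower bound of -(2^53-1) is an assumption. A -0 token has no integer representation, so it is assumed to arrive as a float and is refused with the rest.

```rust
/// Largest accepted integer magnitude: 2^53 - 1.
const MAX_SAFE_INT: u64 = (1u64 << 53) - 1;

/// Hypothetical stand-in for a parsed JSON number.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParsedNumber {
    PosInt(u64),
    NegInt(i64),
    Float(f64),
}

/// Hypothetical refusal reasons for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NumberRefusal {
    FloatSyntax,
    IntegerOutOfRange,
}

/// Accept only integer-syntax numbers whose magnitude is at most 2^53 - 1.
/// Every float is refused regardless of its value, because the original
/// token is no longer available to check how another parser would round it.
pub fn check_number(n: ParsedNumber) -> Result<i64, NumberRefusal> {
    match n {
        ParsedNumber::Float(_) => Err(NumberRefusal::FloatSyntax),
        ParsedNumber::PosInt(u) if u <= MAX_SAFE_INT => Ok(u as i64),
        ParsedNumber::NegInt(i) if i.unsigned_abs() <= MAX_SAFE_INT => Ok(i),
        _ => Err(NumberRefusal::IntegerOutOfRange),
    }
}
```

Under these assumptions, check_number(ParsedNumber::Float(2.0)) is refused as FloatSyntax even though the value is integral, while ParsedNumber::PosInt(9007199254740991) is accepted.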
verify_demo (strict): reject unrecognized record keys.
serde silently drops a key that is not a WalEntry field or alias, so a
flipped, renamed, or injected field name could ride along on an
otherwise-valid record while it still verified Valid. The strict profile
now rejects such records (RejectedReason::UnknownField). The lenient
profile keeps tolerating unknown fields for heterogeneous producers.
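The strict/lenient split can be sketched as a key allow-list check that runs before deserialization. Only the name RejectedReason::UnknownField comes from the description above. The KNOWN_KEYS contents (including the "ts" alias), the Profile enum, the unit shape of the variant, and check_keys itself are assumptions made for illustration.

```rust
/// Hypothetical allow-list: canonical field names plus aliases.
const KNOWN_KEYS: &[&str] = &["event_type", "source", "timestamp_ns", "ts"];

/// Only the variant name is taken from the PR text; its shape is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RejectedReason {
    UnknownField,
}

/// Hypothetical verification profiles.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Strict,
    Lenient,
}

/// In the strict profile, reject the record as soon as one key is not on
/// the allow-list. The lenient profile tolerates unknown keys.
pub fn check_keys<'a, I>(keys: I, profile: Profile) -> Result<(), RejectedReason>
where
    I: IntoIterator<Item = &'a str>,
{
    if profile == Profile::Lenient {
        return Ok(());
    }
    for key in keys {
        if !KNOWN_KEYS.iter().any(|known| *known == key) {
            return Err(RejectedReason::UnknownField);
        }
    }
    Ok(())
}
```

With this sketch a misspelled key such as "evnt_type" fails in Profile::Strict and passes in Profile::Lenient, which is the behavioral difference the fix describes.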
core: uniform hex-anchor normalization across surfaces.
The pinned pubkey and expected root are now normalized identically (trim,
strip an optional 0x/0X, lowercase) in spine-core, so the CLI, the
wasm playground facade, and a direct core caller accept the same inputs.
Previously a 0x-prefixed or whitespace-padded anchor verified on the CLI
but errored in the browser. A root that normalizes to empty is treated as
"no anchor", matching the facade.
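The three normalization steps, in the order listed, might look like the sketch below. The function name normalize_hex_anchor is hypothetical, and returning None for an empty result models the "no anchor" case stated for the expected root; whether the pinned pubkey is handled the same way when empty is not stated here. Hex-digit validation is deliberately left out.

```rust
/// Normalize a caller-supplied hex anchor: trim surrounding whitespace,
/// strip one optional 0x/0X prefix, then lowercase.
/// Returns None when nothing is left, which models "no anchor".
pub fn normalize_hex_anchor(input: &str) -> Option<String> {
    let trimmed = input.trim();
    let digits = trimmed
        .strip_prefix("0x")
        .or_else(|| trimmed.strip_prefix("0X"))
        .unwrap_or(trimmed);
    if digits.is_empty() {
        None
    } else {
        Some(digits.to_ascii_lowercase())
    }
}
```

Routing every surface through one function like this is what makes "  0XABcd12" and "abcd12" compare equal no matter which entry point received them.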
export (CSV): neutralize spreadsheet formula injection.
A CSV cell beginning with =, +, -, @, tab, or CR is interpreted as a
formula by Excel/LibreOffice/Sheets. Attacker-controlled
event_type/source (and a negative timestamp_ns) are now quote-prefixed
per the OWASP guidance.
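As an illustration of the quote-prefix rule, the sketch below prepends a single quote to any cell whose first character is one of the six triggers listed above. The name neutralize_csv_cell is hypothetical, and ordinary CSV quoting of commas and double quotes is assumed to happen in a separate step that is not shown.

```rust
use std::borrow::Cow;

/// Prefix a cell with a single quote when its first character would make a
/// spreadsheet treat it as a formula. Other cells are returned unchanged
/// without allocating.
pub fn neutralize_csv_cell(cell: &str) -> Cow<'_, str> {
    match cell.chars().next() {
        Some('=') | Some('+') | Some('-') | Some('@') | Some('\t') | Some('\r') => {
            Cow::Owned(format!("'{}", cell))
        }
        _ => Cow::Borrowed(cell),
    }
}
```

A negative timestamp_ns rendered as -1700000000 starts with a minus sign, so it is prefixed as well, which matches the parenthetical in the description.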
export (syslog): escape Unicode line separators.
U+0085 (NEL), U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) are
line breaks outside the C0 range; a Unicode-newline-aware SIEM could split
on them. They are now escaped in the quoted MSG body alongside CR/LF.
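A rough sketch of the escaping step follows. The escape spellings (\r, \n, \u0085, \u2028, \u2029) and the name escape_syslog_msg are assumptions, since the description only says these characters are escaped. A complete escaper would presumably also handle backslash and double quote inside the quoted MSG body; that part is omitted here.

```rust
/// Replace CR, LF and the three non-C0 Unicode line breaks with visible
/// escape sequences so a record cannot be split by a newline-aware parser.
pub fn escape_syslog_msg(msg: &str) -> String {
    let mut out = String::with_capacity(msg.len());
    for ch in msg.chars() {
        match ch {
            '\r' => out.push_str("\\r"),
            '\n' => out.push_str("\\n"),
            '\u{0085}' => out.push_str("\\u0085"), // NEL
            '\u{2028}' => out.push_str("\\u2028"), // LINE SEPARATOR
            '\u{2029}' => out.push_str("\\u2029"), // PARAGRAPH SEPARATOR
            other => out.push(other),
        }
    }
    out
}
```

After this pass the output contains none of the five line-break characters, so each exported record stays on one physical line.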
Validation
cargo test --workspace --all-features: green (added regression tests for
every fix above).
cargo clippy --workspace --all-targets --all-features -- -D warnings: clean.
cargo fmt --all --check: clean.
c36bd135... unchanged.
Contract note
Two items tighten the input contract (refuse float-syntax numbers; strict
rejects unknown keys), consistent with the prior tightening that refused
integers >= 2^53. The integer_valued_float cross-language vector is
removed.