THRIFT-5802: Validate set uniqueness in Node.js library #3475
Open
KantConnect wants to merge 1 commit into
Jens-G
approved these changes
May 14, 2026
void t_js_generator::generate_deserialize_container(ostream& out, t_type* ttype, string prefix) {
  string size = tmp("_size");
  string rtmp3 = tmp("_rtmp3");
  string seen; // populated only for sets, used for O(n) duplicate detection on read
Member
The comment seems off: is it O(n) or O(1)?
a50f949 to 075c3e3
A Thrift set is defined to contain unique elements, but the language
implementations handle duplicate elements inconsistently: Go throws,
Python silently deduplicates, and Node.js passes duplicates through
unchanged, which breaks cross-language interop.
Add duplicate-element validation in the JavaScript code generator and
runtime libraries:
* lib/nodejs/lib/thrift/thrift.js, lib/js/src/thrift.js: new
Thrift.checkSetUniqueness(arr) helper. Duplicate detection via a
native JS Set; throws TProtocolException(INVALID_DATA) on the first
duplicate. Both the Node.js and the browser/ES6/TS runtime libraries
are updated, since the JS code generator emits the same
Thrift.checkSetUniqueness call regardless of target.
* compiler/cpp/src/thrift/generate/t_js_generator.cc: in
generate_serialize_container() emit a Thrift.checkSetUniqueness call
immediately before output.writeSetBegin(), so the validation runs
before any bytes hit the transport. In
generate_deserialize_container() for sets, allocate one parallel
`new Set()` before the per-element loop; in
generate_deserialize_set_element() guard the push with
seen.has(elem) / seen.add(elem).
* lib/nodejs/test/check_set_uniqueness.test.js,
lib/nodejs/test/testAll.sh: new tape unit tests covering unique
sets, duplicate sets across primitive types, empty/single-element
edge cases, and the thrown exception type.
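
For readers skimming the description, the two checks listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `TProtocolException` and `INVALID_DATA` are local stand-ins so the snippet runs without the Thrift library, `readSetElements` and its `readElem` callback are hypothetical names for what the generated per-element read loop does, and the error messages are invented.

```javascript
// Local stand-ins (assumptions) so the sketch runs without the Thrift runtime.
const INVALID_DATA = 1; // illustrative value only

class TProtocolException extends Error {
  constructor(type, message) {
    super(message);
    this.name = 'TProtocolException';
    this.type = type;
  }
}

// Write path: validate the whole array before anything is serialized.
// Throws on the first duplicate found.
function checkSetUniqueness(arr) {
  const seen = new Set();
  for (const elem of arr) {
    if (seen.has(elem)) {
      throw new TProtocolException(INVALID_DATA, 'Duplicate element in set: ' + String(elem));
    }
    seen.add(elem);
  }
}

// Read path (hypothetical helper): one parallel Set is allocated before the
// loop, and each push is guarded by seen.has(elem) / seen.add(elem).
function readSetElements(readElem, size) {
  const result = [];
  const seen = new Set();
  for (let i = 0; i < size; ++i) {
    const elem = readElem();
    if (seen.has(elem)) {
      throw new TProtocolException(INVALID_DATA, 'Duplicate element in set: ' + String(elem));
    }
    seen.add(elem);
    result.push(elem);
  }
  return result;
}
```

Each `has`/`add` call on a native Set is typically constant time, so a full pass over a set is linear in the number of elements. A native Set compares objects by identity, so a sketch like this only detects duplicates among primitive values and identical object references.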