SOLR-18237: Do not copy data before noggit parsing #4415

Open
psalagnac wants to merge 1 commit into apache:main from psalagnac:SOLR-18237-no-buffer

@psalagnac
Contributor

https://issues.apache.org/jira/browse/SOLR-18237

This removes the intermediate char[] buffer when parsing collection state (and other JSON data). Instead, the UTF-8 decoding is done on the fly using JVM tooling (which I guess will be faster).
By default, the JSON parser has an internal buffer of 8K.

This removes the buffer that held a full copy of the data when
deserializing collection state from ZooKeeper data.
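
For readers following along, the difference between the two approaches can be sketched with the JDK alone. This is a minimal illustration, not the code from this PR: the class `Utf8DecodingSketch` and its methods are hypothetical names, `decodeEagerly` only stands in for whatever the previous code did to fill its char[] buffer, and `drain` is a stand-in for a parser that reads through a fixed-size internal buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; none of these names come from Solr or noggit.
final class Utf8DecodingSketch {

  // Eager approach: decode the whole byte slice into one char[] up front,
  // so the full decoded document is held in memory before parsing starts.
  static char[] decodeEagerly(byte[] utf8, int offset, int length) {
    CharBuffer chars = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(utf8, offset, length));
    char[] copy = new char[chars.remaining()];
    chars.get(copy);
    return copy;
  }

  // Streaming approach: wrap the slice in a Reader. Bytes are decoded in
  // chunks only as the consumer asks for more characters.
  static Reader openStreaming(byte[] utf8, int offset, int length) {
    return new InputStreamReader(
        new ByteArrayInputStream(utf8, offset, length), StandardCharsets.UTF_8);
  }

  // Stand-in for a parser: pulls characters through a fixed 8K buffer.
  // (It collects them into a String only so the result can be compared.)
  static String drain(Reader reader) {
    StringBuilder out = new StringBuilder();
    char[] buf = new char[8192];
    try (Reader r = reader) {
      int n;
      while ((n = r.read(buf)) != -1) {
        out.append(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    byte[] utf8 = "{\"collection1\":{\"note\":\"caf\u00e9\"}}".getBytes(StandardCharsets.UTF_8);
    String eager = new String(decodeEagerly(utf8, 0, utf8.length));
    String streamed = drain(openStreaming(utf8, 0, utf8.length));
    // Both paths should yield the same characters.
    System.out.println(eager.equals(streamed));
  }
}
```

Both paths produce the same characters; what changes is that the streaming path never needs a char[] sized to the whole document, only the consumer's own working buffer.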
@epugh
Contributor

epugh commented May 12, 2026

This looks like it makes sense; however, it would be awesome if we had a micro benchmark here.
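
As a rough idea of what such a comparison could look like, here is a crude timing harness using only the JDK. It is a sketch under stated assumptions, not a result and not part of this PR: `DecodeTimingSketch`, `sampleJson`, and the other names are hypothetical, the payload is synthetic rather than a real collection state, and it measures decoding only, not noggit parsing. A trustworthy measurement would more likely use a proper benchmark harness such as JMH.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.ToLongFunction;

// Hypothetical timing sketch; names and payload are illustrative only.
final class DecodeTimingSketch {

  // Builds a synthetic JSON-like payload (not a real collection state).
  static byte[] sampleJson(int entries) {
    StringBuilder sb = new StringBuilder("{");
    for (int i = 0; i < entries; i++) {
      if (i > 0) sb.append(',');
      sb.append("\"core_node").append(i).append("\":{\"state\":\"active\"}");
    }
    return sb.append('}').toString().getBytes(StandardCharsets.UTF_8);
  }

  // Decodes everything up front, then sums the chars as a stand-in for parsing.
  static long sumEager(byte[] utf8) {
    CharBuffer chars = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(utf8));
    long sum = 0;
    while (chars.hasRemaining()) sum += chars.get();
    return sum;
  }

  // Decodes on the fly through a Reader with a fixed 8K working buffer.
  static long sumStreaming(byte[] utf8) {
    long sum = 0;
    char[] buf = new char[8192];
    try (Reader r = new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
      int n;
      while ((n = r.read(buf)) != -1) {
        for (int i = 0; i < n; i++) sum += buf[i];
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sum;
  }

  // Average wall-clock nanoseconds per call over the given iterations.
  static long averageNanos(ToLongFunction<byte[]> consumer, byte[] data, int iterations) {
    long sink = 0;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) sink += consumer.applyAsLong(data);
    long elapsed = System.nanoTime() - start;
    if (sink == 42) System.out.print(""); // use the result so the loop is not optimized away
    return elapsed / iterations;
  }

  public static void main(String[] args) {
    byte[] data = sampleJson(20_000);
    // Warm-up rounds so the JIT has compiled both paths before timing.
    averageNanos(DecodeTimingSketch::sumEager, data, 200);
    averageNanos(DecodeTimingSketch::sumStreaming, data, 200);
    System.out.println("eager ns/op:     " + averageNanos(DecodeTimingSketch::sumEager, data, 200));
    System.out.println("streaming ns/op: " + averageNanos(DecodeTimingSketch::sumStreaming, data, 200));
  }
}
```

Numbers from a loop like this are noisy and depend heavily on payload size and JVM warm-up, so they should be read as a sanity check at most.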

@dsmiley dsmiley self-requested a review May 14, 2026 02:11
@@ -0,0 +1,7 @@
title: Do not copy full data for UTF8 to Java string conversion for deserializing a collection state
type: fixed
Contributor

Users will not perceive this as a bug fix; the type should be "changed". But I'm skeptical it warrants a changelog entry at all.

Comment on lines +297 to +299
// convert from bytes to chars on-the-fly and parse directly
// from that instead of going through intermediate buffers
Reader reader = new InputStreamReader(new ByteArrayInputStream(utf8, offset, length), UTF_8);
Contributor

This is definitely the most natural and clear way to do it. Maybe it's faster; I trust your judgement.

3 participants