Fix bad reads of nested cmpd/vlen types in JNI #6413

Open
mattjala wants to merge 16 commits into HDFGroup:develop from mattjala:java_vlen_read_220

@mattjala mattjala commented May 20, 2026

Contributor
  • Fixed H5DreadVL failing to read data for pre-allocated compound-of-vlen datasets

When H5DreadVL is called with an H5T_COMPOUND memory type whose members include variable-length data, and the caller pre-allocates each row slot in the outer Object[] buffer, the read completes without throwing an error, but each pre-allocated row ArrayList comes back empty.

If the caller follows the alternate calling pattern, leaving each row slot null and letting the native code allocate the per-row record, then the same read works correctly.

The root cause is the H5T_COMPOUND case in translate_rbuf. This function walks each row's compound, decodes each member into a Java object via translate_atomic_rbuf, and stores the result in the per-row ArrayList. It branches on a found_jList flag based on the result of GetObjectArrayElement(), i.e. whether the caller pre-allocated the row slot.

Later on, this flag is used to decide between calling the .add() method of ArrayList (if found_jList is false) and trying to directly set an element on a built-in Java array (if found_jList is true).

The latter case is wrong in two ways.

  1. Type mismatch. jList is always an ArrayList, either fetched from ret_buf[i] (where each slot is an ArrayList) or newly created a few lines earlier. The attempt to directly set an element using the built-in array syntax silently no-ops.
  2. Wrong index variable. Even if jList were a Java array, the index supplied is i (the outer row counter), not x (the inner member counter).

There are currently no tests covering H5DreadVL() for H5T_COMPOUND in the src-jni test tree. This was discovered in the context of HDFView attempting to read compound-of-sequence datasets. HDFView's H5Datatype.allocateArray pre-fills each row slot with new ArrayList<>() before handing the buffer to H5DreadVL, which triggered the bug.

My fix is to use the ArrayList.add() method in every case and drop the found_jList tracking, since jList is always an ArrayList and we always want to use .add(). This also matches the pattern of the H5T_VLEN branch, which works for both pre-allocated and non-pre-allocated slots.
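
As a rough illustration of the fixed behavior, here is a pure-Java model of the per-row handling. It is only a sketch: the real logic lives in C in translate_rbuf, and the class and method names below (CompoundRowFill, fillRow) are hypothetical. The point it shows is that a null slot and a pre-allocated slot end up with the same contents once members are always appended with add().

```java
import java.util.ArrayList;
import java.util.List;

/** Pure-Java model (hypothetical) of the per-row handling in the compound read path. */
final class CompoundRowFill {

    /**
     * Fills retBuf[row] with the decoded members of one compound record.
     * A null slot gets a fresh list; a pre-allocated slot is reused.
     * Either way, members are appended with add(), never assigned by index.
     */
    @SuppressWarnings("unchecked")
    static void fillRow(Object[] retBuf, int row, Object[] decodedMembers) {
        List<Object> rowList = (List<Object>) retBuf[row];
        if (rowList == null) {
            rowList = new ArrayList<>();
            retBuf[row] = rowList;
        }
        for (Object member : decodedMembers) {
            rowList.add(member);
        }
    }

    public static void main(String[] args) {
        // One record with an integer member and a variable-length member.
        Object[] members = {42, new ArrayList<>(List.of(1.5, 2.5))};

        Object[] nullSlots = new Object[1];                 // caller leaves the slot null
        Object[] preallocated = {new ArrayList<Object>()};  // caller pre-fills the slot

        fillRow(nullSlots, 0, members);
        fillRow(preallocated, 0, members);

        // Both calling patterns now yield the same row contents.
        System.out.println(nullSlots[0].equals(preallocated[0]));
    }
}
```

In this model there is no need for a flag recording whether the slot was pre-allocated, which mirrors the removal of found_jList from the compound branch.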

To test this case, I added test.TestH5D.testH5Dread_compound_of_vlen_preallocated. For the sake of regression testing, I also added testH5Dread_compound_of_vlen which tests the same thing with non-pre-allocated row slots, although this test passed even before I made any changes to the source code.

I also fixed a very similar problem in H5DwriteVL that led to a SIGSEGV when writing compounds of variable-length sequences. Specifically, translate_atomic_wbuf treated in_obj as a Java array when it was an ArrayList, producing a garbage length value that eventually led to a crash.

I changed the H5T_VLEN case to behave similarly to the H5T_ARRAY and H5T_COMPOUND cases and use mToArray on the provided ArrayList to get an array object for the array operations. The test verifying the fix is testH5Dwrite_compound_of_vlen.
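
The type confusion can be reproduced in plain Java, with the caveat that the failure mode differs: reflection rejects the call with an exception, whereas the native code ended up with a garbage length as described above. The following is a sketch only; VlenElementLength and elementCount are hypothetical names, and the C code uses mToArray through JNI rather than these calls.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of taking the length of a vlen element on the write path. */
final class VlenElementLength {

    /** Returns the element count of a vlen element passed as a List or as an array. */
    static int elementCount(Object inObj) {
        if (inObj instanceof List) {
            // Convert to an array first, then use array operations on the result.
            Object[] array = ((List<?>) inObj).toArray();
            return array.length;
        }
        // Throws IllegalArgumentException if inObj is not an array.
        return Array.getLength(inObj);
    }

    public static void main(String[] args) {
        List<Integer> seq = new ArrayList<>(List.of(7, 8, 9));

        try {
            Array.getLength(seq); // a list is not an array
        } catch (IllegalArgumentException e) {
            System.out.println("array-style length on a list was rejected");
        }

        System.out.println(elementCount(seq)); // length obtained via toArray()
    }
}
```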


translate_rbuf's H5T_VLEN case had a similar bug: when found_jList was set to false because an entry in ret_buf was null, ret_buf.add() would be invoked on an array of objects, which does not have the list .add() method. This would occur whenever a read was invoked on a vlen sequence with a null (non-preallocated) entry. The pre-existing tests only covered the pre-allocated cases.

I removed the use of the found_jList flag, since it conflated the passing of an unallocated slot with ret_buf not being an array. Instead, it now uses ret_buflen == 0 as the check to match the pattern in H5T_INTEGER and other branches.

The test for this fix is testH5Dread_vlen_of_compound_nullslot.


translate_atomic_rbuf had two issues related to handling of nested compounds. First, it discarded recursive return values, resulting in the construction of empty lists. Second, its member offset (char_buf + i * typeSize + memb_offset) was incorrect: i was the member index and typeSize was the size of the entire compound, so the offset would be erroneously large. This seems to have come from copying the offset computation from translate_rbuf, which has to advance over entire elements of compound data. The error was duplicated on the write side in translate_atomic_wbuf's H5T_COMPOUND case (h5util.c:4611).

I changed translate_atomic_rbuf to capture the resultant object, and dropped the i * typeSize term in both routines.
The new test verifying the fix is testH5Dread_vlen_of_nested_compound.
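
To make the offset arithmetic concrete, here is a small worked example in Java. The layout is an assumption chosen for illustration (a nested compound with an int at offset 0 and a double at offset 8, 16 bytes in total), and NestedCompoundOffsets is a hypothetical name. It compares the old expression with the corrected one for a single nested record.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Worked example of the member-offset computation for one nested compound record. */
final class NestedCompoundOffsets {
    // Assumed layout: { int a @ 0; double b @ 8 }, total size 16 bytes.
    static final int TYPE_SIZE = 16;
    static final int[] MEMB_OFFSET = {0, 8};

    /** Offset used before the fix: also scales by the member index i. */
    static int buggyOffset(int base, int i) {
        return base + i * TYPE_SIZE + MEMB_OFFSET[i];
    }

    /** Offset after the fix: base of the record plus the member offset. */
    static int fixedOffset(int base, int i) {
        return base + MEMB_OFFSET[i];
    }

    public static void main(String[] args) {
        ByteBuffer record = ByteBuffer.allocate(TYPE_SIZE).order(ByteOrder.nativeOrder());
        record.putInt(0, 7);
        record.putDouble(8, 2.5);

        // Member b is read correctly at byte 8 of the record.
        System.out.println(record.getDouble(fixedOffset(0, 1)));

        // The old expression gives 1 * 16 + 8 = 24, which is past the 16-byte record.
        System.out.println(buggyOffset(0, 1) >= TYPE_SIZE);
    }
}
```

Note that for member index 0 both expressions agree, so under this layout a nested compound with a single member would not have exposed the problem.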

Fixes #6467

@mattjala mattjala requested a review from jhendersonHDF as a code owner May 20, 2026 14:45
Copilot AI review requested due to automatic review settings May 20, 2026 14:45
@mattjala mattjala added the Component - Wrappers C++, Java & Fortran wrappers label May 20, 2026
@github-project-automation github-project-automation Bot moved this to To be triaged in HDF5 - TRIAGE & TRACK May 20, 2026

Copilot AI left a comment

Pull request overview

This PR fixes multiple JNI translation bugs affecting H5DreadVL/H5DwriteVL for nested datatype combinations (notably compound members containing VLEN, and VLEN of nested compounds), which previously could yield empty Java ArrayList results or crash (SIGSEGV). It also adds regression tests in the Java JNI test suite to cover these cases.

Changes:

  • Fix compound read translation to always append members via ArrayList.add() (instead of incorrectly attempting array element assignment when the caller pre-allocates row slots).
  • Fix VLEN write translation to convert ArrayList inputs via toArray() before performing array-style operations.
  • Add new JUnit tests for compound-of-VLEN reads (preallocated and non-preallocated), VLEN-of-compound reads with null slots, and VLEN-of-nested-compound reads; update expected test output.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| java/src-jni/test/TestH5D.java | Adds helper + new regression tests covering nested compound/VLEN read/write scenarios. |
| java/src-jni/test/testfiles/JUnit-TestH5D.txt | Updates expected JUnit output list/count for the added tests. |
| java/src-jni/jni/h5util.c | Adjusts JNI buffer translation logic for compound/VLEN read/write paths to correctly use ArrayList semantics and correct member offsets. |

Comment thread java/src-jni/jni/h5util.c
Comment thread java/src-jni/jni/h5util.c Outdated
@mattjala mattjala force-pushed the java_vlen_read_220 branch from e65f573 to af4e258 Compare May 20, 2026 17:44
@brtnfld brtnfld added the HDFG-internal Internally coded for use by the HDF Group label May 22, 2026
@brtnfld brtnfld moved this from To be triaged to In progress in HDF5 - TRIAGE & TRACK May 22, 2026
@mattjala mattjala added this to the HDF5 2.2.0 milestone May 26, 2026
@mattjala mattjala force-pushed the java_vlen_read_220 branch 2 times, most recently from 1837946 to 2adb07a Compare May 26, 2026 16:25
jhendersonHDF commented Jun 3, 2026

Collaborator

After looking over this for a while, I have several comments, some related to the changes in this PR and some unrelated:

  • I believe support for the pre-allocated variable-length sequence case should be entirely removed. It was clearly not implemented correctly, it isn't advertised anywhere in example code as far as I can tell and it simply complicates the logic with a pattern that isn't used anywhere else in the API (callers don't pre-allocate buffers for reads of variable-length data in the C API). I believe trying to support this pattern came about in HDFGroup/hdfview@06e888c when changing attributes to be represented with real byte data rather than all strings. If HDFView is making use of this pattern, it should probably be changed.
  • translate_rbuf() and translate_wbuf() should have included the size of the buffer to prevent mistakes such as the indexing issue with char_buf + i * typeSize + memb_offset.
  • Due to many of these cases working previously in HDFView 3.1.1 (the only one I've tested so far), I believe this is a case of the data model contract between the JNI (HDF5 side of things) and the object library (HDFView/Java side of things) becoming broken at some point. From that, it seems to me that the JNI code should be doing a much more thorough job of verifying, immediately at the API level, that the buffer given to it for reading/writing is in the expected (and documented somewhere) format. These mismatches are responsible for display issues/crashes in HDFView the vast majority of the time. While complex compound types could incur some overhead for this processing, I believe it's worth it to pay that price in order to be able to catch problems before they cause a crash.
  • It's apparent that automated UI testing of HDFView with SWTBot needs to be set up somewhere once again, as it would have caught all of these problems the moment things broke. This testing stopped when testing was moved to GH CI, but there are many test files under https://github.com/HDFGroup/hdfview/tree/master/hdfview/src/test/resources/uitest that were previously used to catch problems.
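
The kind of up-front buffer verification suggested in the third point could look roughly like the following. This is a sketch under a stated assumption, not the documented contract: it assumes a read buffer must be an Object[] whose slots are each either null or an ArrayList. ReadBufferCheck and validate are hypothetical names, and a real check would live at the JNI entry point and depend on the datatype being read.

```java
import java.util.ArrayList;

/** Hypothetical up-front shape check for a compound/vlen read buffer. */
final class ReadBufferCheck {

    /** Throws if buf does not match the assumed contract; otherwise returns the slot count. */
    static int validate(Object buf) {
        if (!(buf instanceof Object[])) {
            throw new IllegalArgumentException("read buffer must be an Object[]");
        }
        Object[] slots = (Object[]) buf;
        for (int i = 0; i < slots.length; i++) {
            // Assumed contract: each slot is either null or an ArrayList.
            if (slots[i] != null && !(slots[i] instanceof ArrayList)) {
                throw new IllegalArgumentException(
                    "slot " + i + " must be null or an ArrayList, found "
                        + slots[i].getClass().getName());
            }
        }
        return slots.length;
    }

    public static void main(String[] args) {
        Object[] ok = {null, new ArrayList<Object>()};
        System.out.println(validate(ok));

        try {
            validate(new Object[] {"not a list"});
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast with a descriptive exception trades a linear pass over the buffer for an error at the API boundary instead of a crash deep in the translation code.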

github-actions Bot commented Jun 12, 2026

Contributor

Review Checklist

This PR touches the following areas. Each needs a sign-off
from its listed owners before merging.

  • java

mattjala and others added 11 commits June 16, 2026 11:22
`translate_rbuf`'s H5T_VLEN case had a similar bug where when `found_jList` was set to false due to an entry in `ret_buf` being null, `ret_buf.add()` would be invoked on an array of objects without the list .add() method. This would occur whenever a read was invoked of a vlen sequence with a null (non-preallocated) entry. The pre-existing tests only tested the pre-allocated cases.

I removed the use of the `found_jList` flag, since it conflated the passing of an unallocated slot with `ret_buf` not being an array. Instead use `ret_buflen == 0` as the check to match the pattern in H5T_INTEGER and other branches.

The test for this fix is testH5Dread_vlen_of_compound_nullslot.

---

`translate_atomic_rbuf` had two issues related to handling of nested compounds. First, it discarded recursive returns, resulting in the construction of empty lists. Second, its member offset (`char_buf + i * typeSize + memb_offset`) was incorrect: `i` was the member index and `typeSize` was the size of the entire compound, so the offset would be erroneously large. This seems to have come from copying the offset computation from `translate_rbuf`, which has to advance over entire elements of compound data. The error was duplicated on the write side in `translate_atomic_wbuf`'s H5T_COMPOUND case (h5util.c:4611).

I changed `translate_atomic_rbuf` to capture the resultant object, and dropped the `i * typeSize` term in both routines.
The new test verifying the fix works is `testH5Dread_vlen_of_nested_compound`.
@mattjala mattjala force-pushed the java_vlen_read_220 branch from 54f0c88 to 8c5c5f4 Compare June 16, 2026 16:22
nbagha1 commented Jun 16, 2026

Collaborator

@jhendersonHDF - 10294

@jhendersonHDF jhendersonHDF left a comment

Collaborator

A couple minor comments but I think this is a solid improvement over the previous logic. The comments about future changes can be ignored for now, but should probably be dealt with eventually. There are also a few minor CI failures to be fixed.

Comment thread java/src-jni/jni/h5util.c
/* Convert element to a vlen element */
hvl_t vl_elem;

jsize jnelmts = ENVPTR->GetArrayLength(ENVONLY, in_obj);

Collaborator

As a future minor optimization, if it's known that this will be a flat ArrayList, just calling .size() would be a little better than converting to an array just to get the length. This also looks like another case of losing track of / not validating the expected format of the buffer before operating on it.

Comment thread java/src-jni/jni/h5util.c
if (mToArray == NULL)
CHECK_JNI_EXCEPTION(ENVONLY, JNI_FALSE);
jobjectArray array = (jobjectArray)ENVPTR->CallObjectMethod(ENVONLY, in_obj, mToArray);
jsize jnelmts = ENVPTR->GetArrayLength(ENVONLY, array);

Collaborator

Another future note: this particular case looks like the data model should be changed, as Object[] should cover this case without needing to involve ArrayList.

* and the parameter <i>data</i> can be any multi-dimensional array of numbers, such as float[][], or
* int[][][], or Double[][].
* <p>
* <b>Buffer data model</b>

Collaborator

Could you also leave a similar header comment on the translate_wbuf() / translate_rbuf() functions so that the data model is documented right where the data is transformed? For those working in the JNI that will probably be easier to find.

Comment thread java/src-jni/jni/h5util.c
*
* Caller-side pre-allocation of per-element slots is not supported.
* Any object already present in a Java array slot is overwritten. */
retIsList = ENVPTR->IsInstanceOf(ENVONLY, ret_buf, arrCList);

@jhendersonHDF jhendersonHDF Jun 23, 2026

Collaborator

Future note: ideally, this should be split out so that translate_rbuf() isn't called recursively and doesn't add to the type confusion; separate methods should exist for separate types, even if those only shape the data into something appropriate for a common helper function.

Comment thread java/src-jni/jni/h5util.c
}
case H5T_INTEGER:
case H5T_ENUM:
case H5T_BITFIELD:

Collaborator

Opaque types probably need to be left out of this particular verification, or the checks need to be changed for them. Since they can be arbitrary size, you're not likely to find a matching Java type for them (they're just written as a byte[] for each element). It looks like translate_atomic_wbuf() is handling this incorrectly as well. I believe the logic should be the same as for H5T_REFERENCE.

Labels

Component - Wrappers C++, Java & Fortran wrappers HDFG-internal Internally coded for use by the HDF Group

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

JNI mishandles compound and variable-length datatypes

5 participants