
Various improvement in documentation and a decoding function #6383

Open
bmribler wants to merge 15 commits into HDFGroup:develop from bmribler:improve_error_checking


@bmribler

Collaborator

- Improves documentation on the type size when creating/accessing a compound datatype with no predefined struct (GH issue HDFGroup#5371)
- Provides a better description of the min_meta_perc and min_raw_perc arguments of H5Pset_page_buffer_size() (GH issue HDFGroup#5711)
- Adds error checking to an internal decoding function
Comment thread hl/src/H5TBpublic.h Outdated
Comment thread hl/src/H5TBpublic.h Outdated
Comment thread hl/src/H5TBpublic.h Outdated
vchoi-hdfgroup previously approved these changes May 15, 2026
Comment thread src/H5Fsuper_cache.c Outdated
Comment thread hl/src/H5TBpublic.h Outdated
@bmribler added the labels "Component - C Library" (Core C library issues, usually in the src directory) and "Component - Documentation" (Doxygen, markdown, etc.) Jun 2, 2026
@ajelenak added this to the HDF5 2.2.0 milestone Jun 2, 2026
Comment thread hl/src/H5TBpublic.h
* identifier \p loc_id. The dataset is extended to hold the
* new records.
*
* \p type_size can be obtained with \c sizeof(), if the data is

@jhendersonHDF jhendersonHDF Jun 2, 2026

Collaborator

Based on my reading, for most of these functions type_size is used to specify the size of the memory datatype, which may not necessarily be the same size as the file type (for example, if writing from a packed memory compound type). From that, H5TBget_field_info() is probably not what applications should call here. I believe the same advice from above about adding the sizes of the fields plus any structure padding still applies here and for several or all of the functions below.
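To illustrate the distinction between a padded struct and a packed memory layout, here is a minimal sketch in plain C. The `record_t` type and both helper functions are hypothetical and are not part of the HDF5 API; they only show why `sizeof()` on a struct and the sum of the field sizes can disagree.

```c
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
typedef struct {
    char   flag;
    double value;
    int    count;
} record_t;

/* Size of a tightly packed in-memory layout: the field sizes added
 * together, with no padding between or after the fields. */
static size_t packed_record_size(void)
{
    return sizeof(char) + sizeof(double) + sizeof(int);
}

/* Size of the compiler's layout of record_t, including any padding
 * the compiler inserts between fields and after the last one. */
static size_t struct_record_size(void)
{
    return sizeof(record_t);
}
```

On platforms where `double` is aligned to more than one byte, `struct_record_size()` is larger than `packed_record_size()`, so the value passed as type_size has to match whichever layout the application's buffer actually uses.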

Member

I think the question is, if the data is not stored in a C struct, how is it stored? If the application is manually moving bytes around in memory then it should know these values.

Collaborator Author

My understanding of @jhendersonHDF's comment is that I should use the same description of type_size as in H5TBmake_table() for all type_size entries. I'll go ahead with that unless @fortnern thinks otherwise, as your comment seemed to suggest.

Member

I think that's fine, though I added a comment about the method specified for make_table: you only need to look at the field with the highest offset.

Comment thread src/H5Fsuper_cache.c Outdated
Comment thread src/H5Fsuper_cache.c Outdated
bmribler and others added 4 commits June 4, 2026 01:21
Comment thread hl/src/H5TBpublic.h
* \p dset_name attached to the object specified by the
* identifier loc_id.
*
* \p type_size can be obtained with \c sizeof(), if the data is

@fortnern fortnern Jun 4, 2026

Member

You really just need the last offset plus the size of the last type (and any padding bytes desired after the last member). This assumes the fields are sorted by offset; otherwise, use the highest offset and the size of the type for the field with the highest offset.
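The calculation described above could be sketched as follows. This is an illustration under stated assumptions, not code from the PR: `type_size_from_offsets` is a hypothetical helper, and `field_sizes` stands in for the sizes an application would obtain by calling `H5Tget_size()` on each entry of field_types.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the HDF5 API.
 *
 * Finds the field with the highest offset and returns that offset plus
 * the size of that field plus any padding desired after it. The fields
 * do not need to be sorted by offset. Returns 0 when there are no fields. */
static size_t type_size_from_offsets(size_t nfields,
                                     const size_t *field_offset,
                                     const size_t *field_sizes,
                                     size_t tail_padding)
{
    size_t last = 0; /* index of the field with the highest offset so far */

    if (nfields == 0)
        return 0;

    for (size_t i = 1; i < nfields; i++) {
        if (field_offset[i] > field_offset[last])
            last = i;
    }

    return field_offset[last] + field_sizes[last] + tail_padding;
}
```

For example, with offsets {0, 8, 4} and sizes {4, 8, 4} the field with the highest offset starts at byte 8 and is 8 bytes long, so the helper returns 16 when no trailing padding is requested.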

Collaborator

True, though I was mostly going for simplicity, and as you already mentioned, if the structure is just an untyped block of bytes, the size should generally already be known. I do think the API design and documentation leave a bit to be desired, and telling someone how to calculate the size of their in-memory structure here may add more potential for confusion than necessary.

@bmribler bmribler Jun 8, 2026

Collaborator Author

How about: "Otherwise, \p type_size should be calculated based on the highest offset in \p field_offset, the size of its corresponding datatype in \p field_types, and any padding bytes desired after that field."

@bmribler bmribler requested a review from fortnern June 8, 2026 20:35
@lrknox lrknox removed their request for review June 11, 2026 21:21

Labels

Component - C Library: Core C library issues (usually in the src directory)
Component - Documentation: Doxygen, markdown, etc.

Projects

Status: To be triaged

Development

Successfully merging this pull request may close these issues.

Improve documentation for creating a compound datatype with no predefined struct

5 participants