[VL][Delta] Add native Delta bitmap aggregation support #12214
Open
malinjawi wants to merge 4 commits into
Run Gluten Clickhouse CI on x86
What changes are proposed in this pull request?
This PR is the next split of the Delta deletion-vector (DV) merge-on-read (MoR) work. It adds the native bitmap primitive needed by the later DELETE DV work, without changing DELETE routing or enabling native bitmap construction in the command path yet.
Main changes:
- RoaringBitmapArray for Delta Portable-format deletion-vector payloads
- readSafe
- bitmapaggregator support for Delta row-index aggregation
- delta_bitmap_benchmark with construction, partial-merge, and deserialize/probe cases

This PR is intentionally primitive-only:
Those pieces remain in follow-up split PRs after the primitive and benchmark shape are reviewed.
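To make the shape of the primitive concrete, here is a minimal Python sketch of a 64-bit row-index bitmap and its partial-merge step. It assumes the layout splits each value into a high 32-bit bucket key and a low 32-bit member, and it uses plain Python sets as stand-ins for 32-bit Roaring bitmaps. `ToyBitmapArray` and its methods are hypothetical names for illustration only, not the API added by this PR.

```python
class ToyBitmapArray:
    """Toy 64-bit bitmap: one bucket per high 32-bit key.

    Each bucket is a Python set standing in for a 32-bit Roaring bitmap.
    This models the idea only; it is not the native implementation.
    """

    def __init__(self):
        self.buckets = {}  # high 32 bits -> set of low 32-bit values

    @staticmethod
    def _split(value):
        if not 0 <= value < (1 << 64):
            raise ValueError("value must fit in an unsigned 64-bit integer")
        return value >> 32, value & 0xFFFFFFFF

    def add(self, value):
        high, low = self._split(value)
        self.buckets.setdefault(high, set()).add(low)

    def contains(self, value):
        high, low = self._split(value)
        return low in self.buckets.get(high, ())

    def merge(self, other):
        """Union another partial result into this one, bucket by bucket."""
        for high, lows in other.buckets.items():
            self.buckets.setdefault(high, set()).update(lows)

    def cardinality(self):
        return sum(len(lows) for lows in self.buckets.values())

    def last(self):
        """Largest value stored; raises ValueError when the bitmap is empty."""
        high = max(self.buckets)
        return (high << 32) | max(self.buckets[high])


# Two partial aggregates over row indexes, merged as a final step would.
left = ToyBitmapArray()
for row_index in (1, 7):
    left.add(row_index)

right = ToyBitmapArray()
for row_index in (7, 1 << 33):
    right.add(row_index)

merged = ToyBitmapArray()
merged.merge(left)
merged.merge(right)

print(merged.cardinality(), merged.last(), merged.contains(1 << 33))
```

The values 1, 7, and 1 << 33 mirror the ones used in the validation section: the first two share bucket 0, while 1 << 33 lands in bucket 2, so the example exercises both the single-bucket and the multi-bucket path.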
How was this patch tested?
Post-rebase validation on top of current upstream/main (33be6fb8bf703ac16eae3c75efa919a97d9cdf5a):
- git diff --check upstream/main...HEAD
- env JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home PATH=/opt/homebrew/opt/openjdk@17/bin:$PATH ./build/mvn test-compile -pl backends-velox -am -Pjava-17,spark-3.5,backends-velox,hadoop-3.3,spark-ut,delta -DskipTests
- env JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home PATH=/opt/homebrew/opt/openjdk@17/bin:$PATH ./build/mvn test-compile -pl backends-velox -am -Pjava-17,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut,delta -DskipTests

Focused standalone native validation from the same diff before the final rebase:
- RoaringBitmapArrayTest: passed all 9 focused tests
- 1, 7, and 1 << 33 is read by native code; native compact portable payload for the same values is read by a Delta 3.3.2 JVM helper with cardinality 3, all expected contains checks, and last value 8589934592
- delta_bitmap_benchmark construction/merge output: /tmp/delta_bitmap_benchmark_delete_construction.json
- delta_bitmap_benchmark read/probe output: /tmp/delta_bitmap_benchmark_read_probe.json

Benchmark highlights from the standalone run:
- 7.91 ms, 132.5M rows/s
- 9.99 ms, 105.0M rows/s
- 10.10 ms, 103.9M rows/s
- 2.28 ms, 114.9M rows/s
- 1.12 ms
- 1.32 ms
- 487 us for an 8,192-probe sample

CI status:
Notes:
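As a reading aid for the benchmark highlights above, the following sketch derives two quantities the figures imply but do not state: the approximate row count behind each time/throughput pair, and the average cost per probe in the 8,192-probe sample. The derived row counts are arithmetic on the rounded figures, not values reported by the benchmark itself.

```python
# (elapsed milliseconds, throughput in millions of rows per second),
# copied from the benchmark highlights above.
highlights = [(7.91, 132.5), (9.99, 105.0), (10.10, 103.9), (2.28, 114.9)]


def implied_rows(elapsed_ms, mrows_per_s):
    """rows = seconds * rows_per_second, computed from the rounded figures."""
    return (elapsed_ms / 1000.0) * (mrows_per_s * 1_000_000)


rows = [implied_rows(ms, rate) for ms, rate in highlights]

# 487 us spread over 8,192 probes, expressed in nanoseconds per probe.
probe_sample_us = 487
probe_count = 8192
ns_per_probe = probe_sample_us * 1000 / probe_count

for (ms, rate), n in zip(highlights, rows):
    print(f"{ms} ms at {rate}M rows/s implies about {n:,.0f} rows")
print(f"about {ns_per_probe:.1f} ns per probe")
```

The first three pairs work out to roughly 1.05 million rows each and the fourth to roughly 0.26 million, which suggests the cases ran over two different input sizes; the exact sizes would need to be confirmed from the benchmark JSON output.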
Was this patch authored or co-authored using generative AI tooling?
Generated-by: IBM BOB