fix: allow safe mixed Spark/Comet partial/final aggregate execution#4015
Draft
andygrove wants to merge 5 commits intoapache:mainfrom
Draft
fix: allow safe mixed Spark/Comet partial/final aggregate execution#4015andygrove wants to merge 5 commits intoapache:mainfrom
andygrove wants to merge 5 commits intoapache:mainfrom
Conversation
Previously, when one aggregate stage (Partial or Final) couldn't be converted to Comet, the other was also blocked to avoid crashes from incompatible intermediate buffer formats (issues apache#1389, apache#1267). This change introduces per-aggregate `supportsMixedPartialFinal` declarations so that aggregates with simple, compatible buffers (MIN, MAX, COUNT, bitwise) can safely run in mixed mode while unsafe aggregates (SUM, AVG, Variance, CollectSet) continue to be blocked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Restore `convert` scaladoc in `CometAggregateExpressionSerde` that was displaced when `supportsMixedPartialFinal` was added - Require `aggregateExpressions.nonEmpty` in `findPartialAggInPlan` so intermediate distinct-elimination stages (empty agg, group-by only) are not incorrectly tagged as the Partial to disable - Document that `canFinalAggregateBeConverted` mirrors the predicate checks in `CometBaseAggregate.doConvert` and must be kept in sync
Member
Author
|
@Shekharrajak This PR draws some inspiration from #2994. Thanks for the early work towards this. |
If the corresponding partial aggregate would also fail conversion to Comet (for example, collect_set on float is incompatible), tagging it early hijacks the more specific natural fallback reason. Only tag the partial when it would otherwise have been converted, so the tag guards genuine buffer-format mismatches rather than masking unrelated fallbacks. Generalize the convertibility predicate to accept an expected mode and mirror the mode-specific result-expression handling in doConvert.
… files findPartialAggInPlan was using a deep tree traversal that matched partial aggregates separated from the final by other aggregate stages. For Spark's distinct-aggregate rewrite, the partial for non-distinct aggs feeds into a PartialMerge stage rather than directly into the final, so tagging it as unsafe is incorrect and hijacks the natural 'Unsupported aggregation mode PartialMerge' fallback reason. Walk only through exchanges and AQE stages. Also regenerate TPC-DS plan-stability golden files for Spark 3.4, 3.5, and 4.0 to reflect the branch's new safe-mixed-execution behavior where the final aggregate converts to Comet when all aggregate functions have compatible intermediate buffer formats.
4b5d992 to
753a9a5
Compare
Arrow's row format, used by DataFusion's grouped hash aggregate for
composite group keys, does not support Map at any nesting level. The
existing guard in CometBaseAggregate.doConvert only matched top-level
MapType, so queries grouping by e.g. array<map<int,int>> crashed with
"Row format support not yet implemented for: [SortField { ... List(Map(...)) }]"
once the new mixed-partial-final path produced a Comet Final aggregate
over Spark-partial output.
Add a recursive QueryPlanSerde.containsMapType helper that walks into
ArrayType and StructType, and use it in both doConvert and
canAggregateBeConverted. Add a regression test exercising the failing
group-by.sql query shape from SQLQueryTestSuite.
Contributor
|
might be related to #4003 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #1389.
Part of #1267.
Rationale for this change
When a Spark query's Final-mode aggregate cannot be converted to Comet (for example because its result expressions are not supported, as in
concat(flatten(collect_set(col)))), Comet would still convert the upstream Partial-mode aggregate. The Partial produces intermediate buffers in a format the Spark Final cannot interpret (different encodings forCollectSet,Average, decimalSum, variance, etc.), which crashes at runtime with errors such asNot supported on CometListVector.Conversely, most aggregates block even a safe Spark-Partial + Comet-Final combination, where the buffer formats are in fact compatible (
MIN,MAX,COUNT, bitwise).This change prevents the crash for unsafe aggregates and unlocks the mixed execution for the safe ones.
This PR improves Comet native coverage for TPC-DS.
What changes are included in this PR?
supportsMixedPartialFinalflag onCometAggregateExpressionSerde, defaulting tofalse. Set totrueforMIN,MAX,COUNT,BitAndAgg,BitOrAgg,BitXorAgg, whose intermediate buffer formats match between Spark and Comet.QueryPlanSerde.allAggsSupportMixedExecutionchecks the flag across an aggregate's expressions.CometExecRule.tagUnsafePartialAggregatesruns before bottom-up transformation. For each Final-mode aggregate whose expressions are not all mixed-safe, it conservatively checks whether the Final itself is convertible via the newcanFinalAggregateBeConverted(mirrors the predicates inCometBaseAggregate.doConvert). If not, the corresponding Partial (looked up byfindPartialAggInPlan, traversing throughAQEShuffleReadExecandShuffleQueryStageExec) is tagged withCOMET_UNSAFE_PARTIAL.CometBaseAggregate.doConverthonours the new tag, and now permits the Spark-Partial + Comet-Final case when all aggregates are mixed-safe.How are these changes tested?
CometExecRuleSuite:SUM(unsafe) is un-ignored; asserts neither side is converted.SUM; asserts neither side is converted.MIN/MAX/COUNT; asserts partial converts to Comet, final stays Spark.MIN/MAX/COUNT; asserts partial stays Spark, final converts to Comet.