
fix: allow safe mixed Spark/Comet partial/final aggregate execution#4015

Draft
andygrove wants to merge 5 commits into apache:main from andygrove:fix/safe-mixed-partial-final-aggregates


@andygrove andygrove commented Apr 21, 2026

Which issue does this PR close?

Closes #1389.

Part of #1267.

Rationale for this change

When a Spark query's Final-mode aggregate cannot be converted to Comet (for example because its result expressions are not supported, as in concat(flatten(collect_set(col)))), Comet would still convert the upstream Partial-mode aggregate. The Partial produces intermediate buffers in a format the Spark Final cannot interpret (different encodings for CollectSet, Average, decimal Sum, variance, etc.), which crashes at runtime with errors such as Not supported on CometListVector.

Conversely, the existing check is overly strict: it blocks even a safe Spark-Partial + Comet-Final combination for aggregates whose intermediate buffer formats are in fact compatible (MIN, MAX, COUNT, bitwise).

This change prevents the crash for unsafe aggregates and unlocks the mixed execution for the safe ones.
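
To make the buffer-format problem concrete, here is a toy sketch in plain Scala. Both layouts are invented for illustration and are not the actual Spark or Comet encodings; the names `BufferFormatSketch`, `SumCount`, `MeanCount`, and `misread` are hypothetical. It only shows why a Final stage must agree with the Partial on the meaning of each buffer field, while a single-value buffer such as MIN has nothing to disagree about.

```scala
object BufferFormatSketch {
  // Hypothetical layout A: an average's partial state as (sum, count).
  final case class SumCount(sum: Double, count: Long)
  // Hypothetical layout B: the same state as (running mean, count).
  final case class MeanCount(mean: Double, count: Long)

  def partialA(xs: Seq[Double]): SumCount = SumCount(xs.sum, xs.size.toLong)

  def partialB(xs: Seq[Double]): MeanCount =
    MeanCount(if (xs.isEmpty) 0.0 else xs.sum / xs.size, xs.size.toLong)

  // A final stage written for layout A: add the sums, add the counts, divide.
  def finalA(buffers: Seq[SumCount]): Double = {
    val total = buffers.map(_.sum).sum
    val n = buffers.map(_.count).sum
    total / n
  }

  // Reading a layout-B buffer as if it were layout A: the field positions
  // line up, but the first field no longer means "sum".
  def misread(b: MeanCount): SumCount = SumCount(b.mean, b.count)

  // MIN: the partial state is a single value, so there is only one
  // reasonable layout and merging is the same whoever produced it.
  def partialMin(xs: Seq[Int]): Int = xs.min
  def finalMin(buffers: Seq[Int]): Int = buffers.min
}
```

In this toy the mismatch yields a silently wrong answer; in the case this PR addresses, the mismatch surfaces as a runtime error instead.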

This PR improves Comet native coverage for TPC-DS.

What changes are included in this PR?

  • New supportsMixedPartialFinal flag on CometAggregateExpressionSerde, defaulting to false. Set to true for MIN, MAX, COUNT, BitAndAgg, BitOrAgg, BitXorAgg, whose intermediate buffer formats match between Spark and Comet.
  • QueryPlanSerde.allAggsSupportMixedExecution checks the flag across an aggregate's expressions.
  • CometExecRule.tagUnsafePartialAggregates runs before bottom-up transformation. For each Final-mode aggregate whose expressions are not all mixed-safe, it conservatively checks whether the Final itself is convertible via the new canFinalAggregateBeConverted (mirrors the predicates in CometBaseAggregate.doConvert). If not, the corresponding Partial (looked up by findPartialAggInPlan, traversing through AQEShuffleReadExec and ShuffleQueryStageExec) is tagged with COMET_UNSAFE_PARTIAL.
  • CometBaseAggregate.doConvert honours the new tag, and now permits the Spark-Partial + Comet-Final case when all aggregates are mixed-safe.
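
As a rough illustration of how the flag drives the outcome, the sketch below models the decision in plain Scala without any Spark types. Only `supportsMixedPartialFinal` and `allAggsSupportMixedExecution` are names from this PR, and their signatures here are assumptions; `AggSerde`, `MinSerde`, `CountSerde`, `SumSerde`, `Placement`, and `place` are hypothetical stand-ins. The real logic is spread across `CometExecRule` and `CometBaseAggregate.doConvert`, so this condenses the intended result rather than the actual control flow.

```scala
object MixedAggSketch {
  // Simplified stand-in for CometAggregateExpressionSerde.
  trait AggSerde {
    def name: String
    // Defaults to false, as in the PR description.
    def supportsMixedPartialFinal: Boolean = false
  }

  object MinSerde extends AggSerde {
    val name = "min"
    override def supportsMixedPartialFinal: Boolean = true
  }

  object CountSerde extends AggSerde {
    val name = "count"
    override def supportsMixedPartialFinal: Boolean = true
  }

  // SUM keeps the default and is therefore treated as unsafe to mix.
  object SumSerde extends AggSerde { val name = "sum" }

  // True only when every aggregate in the operator is mixed-safe.
  // Note: this toy version returns true for an empty list.
  def allAggsSupportMixedExecution(aggs: Seq[AggSerde]): Boolean =
    aggs.forall(_.supportsMixedPartialFinal)

  final case class Placement(partialOnComet: Boolean, finalOnComet: Boolean)

  // If both stages agree, or the aggregates are mixed-safe, each stage runs
  // wherever it can. Otherwise both stages stay on Spark.
  def place(
      partialConvertible: Boolean,
      finalConvertible: Boolean,
      aggs: Seq[AggSerde]): Placement =
    if (partialConvertible == finalConvertible || allAggsSupportMixedExecution(aggs))
      Placement(partialConvertible, finalConvertible)
    else
      Placement(partialOnComet = false, finalOnComet = false)
}
```

For example, `place(true, false, Seq(SumSerde))` keeps both stages on Spark, while `place(false, true, Seq(MinSerde, CountSerde))` lets the Final run on Comet over Spark-Partial output.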

How are these changes tested?

CometExecRuleSuite:

  • Existing test for Comet-Partial + Spark-Final with SUM (unsafe) is un-ignored; asserts neither side is converted.
  • New test for Spark-Partial + Comet-Final with SUM; asserts neither side is converted.
  • New test for Comet-Partial + Spark-Final with MIN/MAX/COUNT; asserts partial converts to Comet, final stays Spark.
  • New test for Spark-Partial + Comet-Final with MIN/MAX/COUNT; asserts partial stays Spark, final converts to Comet.

andygrove and others added 2 commits April 20, 2026 21:15
Previously, when one aggregate stage (Partial or Final) couldn't be
converted to Comet, the other was also blocked to avoid crashes from
incompatible intermediate buffer formats (issues apache#1389, apache#1267).

This change introduces per-aggregate `supportsMixedPartialFinal` declarations
so that aggregates with simple, compatible buffers (MIN, MAX, COUNT, bitwise)
can safely run in mixed mode while unsafe aggregates (SUM, AVG, Variance,
CollectSet) continue to be blocked.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Restore `convert` scaladoc in `CometAggregateExpressionSerde` that was
  displaced when `supportsMixedPartialFinal` was added
- Require `aggregateExpressions.nonEmpty` in `findPartialAggInPlan` so
  intermediate distinct-elimination stages (empty agg, group-by only)
  are not incorrectly tagged as the Partial to disable
- Document that `canFinalAggregateBeConverted` mirrors the predicate
  checks in `CometBaseAggregate.doConvert` and must be kept in sync
@andygrove
Member Author

@Shekharrajak This PR draws some inspiration from #2994. Thanks for the early work towards this.

If the corresponding partial aggregate would also fail conversion to Comet
(for example, collect_set on float is incompatible), tagging it early
hijacks the more specific natural fallback reason. Only tag the partial
when it would otherwise have been converted, so the tag guards genuine
buffer-format mismatches rather than masking unrelated fallbacks.

Generalize the convertibility predicate to accept an expected mode and
mirror the mode-specific result-expression handling in doConvert.
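
A minimal sketch of the tagging condition described in this commit message, in plain Scala. The function names, parameters, and the tag message string are hypothetical; the real check lives in `CometExecRule.tagUnsafePartialAggregates` and may be structured differently.

```scala
object TagDecisionSketch {
  // Tag the Partial only when the tag is what actually prevents a
  // buffer-format mismatch: the aggregates are not mixed-safe, the Final
  // cannot run on Comet, and the Partial would otherwise have converted.
  def shouldTagPartial(
      allAggsMixedSafe: Boolean,
      finalConvertible: Boolean,
      partialConvertible: Boolean): Boolean =
    !allAggsMixedSafe && !finalConvertible && partialConvertible

  // Hypothetical message; the real tag text is not shown in this PR page.
  val TagReason = "partial disabled: final aggregate stays on Spark"

  // The reason reported for the Partial falling back. An untagged Partial
  // keeps its own, more specific reason (if it has one).
  def partialFallbackReason(
      tagged: Boolean,
      naturalReason: Option[String]): Option[String] =
    if (tagged) Some(TagReason) else naturalReason
}
```

With this condition, a Partial that already fails conversion for its own reason is left untagged, so that reason is what gets reported.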
… files

findPartialAggInPlan was using a deep tree traversal that matched partial
aggregates separated from the final by other aggregate stages. For Spark's
distinct-aggregate rewrite, the partial for non-distinct aggs feeds into a
PartialMerge stage rather than directly into the final, so tagging it as
unsafe is incorrect and hijacks the natural 'Unsupported aggregation mode
PartialMerge' fallback reason. Walk only through exchanges and AQE stages.
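
The narrowed lookup can be sketched on a toy plan tree, as below. The node types are simplified stand-ins rather than Spark's operators, and `findPartialAgg` / `partialFor` are hypothetical names, not the signature of `findPartialAggInPlan`. The sketch also includes the non-empty `aggregateExpressions` requirement from the earlier commit.

```scala
import scala.annotation.tailrec

object FindPartialSketch {
  sealed trait Node
  // mode is "Partial", "PartialMerge" or "Final" in this toy model.
  final case class Agg(mode: String, aggExprs: Seq[String], child: Node) extends Node
  // Stands in for a shuffle exchange.
  final case class Exchange(child: Node) extends Node
  // Stands in for AQEShuffleReadExec / ShuffleQueryStageExec.
  final case class AqeStage(child: Node) extends Node
  // Any other operator, e.g. a scan.
  final case class Other(child: Option[Node]) extends Node

  // Walk only through exchanges and AQE stages. Stop at the first other
  // node: it is either the Partial we want or a reason to give up.
  @tailrec
  def findPartialAgg(node: Node): Option[Agg] = node match {
    case Exchange(c) => findPartialAgg(c)
    case AqeStage(c) => findPartialAgg(c)
    // Group-by-only stages (no aggregate expressions) are not matched.
    case a @ Agg("Partial", exprs, _) if exprs.nonEmpty => Some(a)
    case _ => None
  }

  def partialFor(finalAgg: Agg): Option[Agg] = findPartialAgg(finalAgg.child)
}
```

A Partial that sits below a PartialMerge stage is not returned, because the walk stops at the PartialMerge instead of descending through it.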

Also regenerate TPC-DS plan-stability golden files for Spark 3.4, 3.5, and
4.0 to reflect the branch's new safe-mixed-execution behavior where the
final aggregate converts to Comet when all aggregate functions have
compatible intermediate buffer formats.
@andygrove andygrove force-pushed the fix/safe-mixed-partial-final-aggregates branch from 4b5d992 to 753a9a5 Compare April 21, 2026 13:08
@andygrove andygrove added the performance and bug labels Apr 21, 2026
Arrow's row format, used by DataFusion's grouped hash aggregate for
composite group keys, does not support Map at any nesting level. The
existing guard in CometBaseAggregate.doConvert only matched top-level
MapType, so queries grouping by e.g. array<map<int,int>> crashed with
"Row format support not yet implemented for: [SortField { ... List(Map(...)) }]"
once the new mixed-partial-final path produced a Comet Final aggregate
over Spark-partial output.

Add a recursive QueryPlanSerde.containsMapType helper that walks into
ArrayType and StructType, and use it in both doConvert and
canAggregateBeConverted. Add a regression test exercising the failing
group-by.sql query shape from SQLQueryTestSuite.
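
The recursive check can be sketched as follows. To keep the example runnable without Spark on the classpath, it uses a small hypothetical type model (`SimpleType`, `ArrayT`, `StructT`, `MapT`) instead of Spark's `DataType` hierarchy; the actual `QueryPlanSerde.containsMapType` operates on Spark types and its exact signature is not shown here.

```scala
object ContainsMapSketch {
  // Toy stand-ins for Spark's DataType, ArrayType, StructType and MapType.
  sealed trait SimpleType
  case object IntT extends SimpleType
  case object StringT extends SimpleType
  final case class ArrayT(element: SimpleType) extends SimpleType
  final case class StructT(fields: Seq[(String, SimpleType)]) extends SimpleType
  final case class MapT(key: SimpleType, value: SimpleType) extends SimpleType

  // True if a map appears at any nesting level, not just at the top.
  def containsMapType(t: SimpleType): Boolean = t match {
    case MapT(_, _)  => true
    case ArrayT(e)   => containsMapType(e)
    case StructT(fs) => fs.exists { case (_, fieldType) => containsMapType(fieldType) }
    case _           => false
  }
}
```

A top-level-only guard would accept `ArrayT(MapT(IntT, IntT))`, which corresponds to the `array<map<int,int>>` group key from the failing query; the recursive version rejects it.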
@comphead
Contributor

might be related to #4003



Development

Successfully merging this pull request may close these issues.

AQE may materialize a non-supported Final-mode HashAggregate

2 participants