chore: Improve shuffle fallback logic by andygrove · Pull Request #3989 · apache/datafusion-comet

andygrove · 2026-04-18T13:34:26Z

Which issue does this PR close?

Closes #3984

Rationale for this change

The main motivation is cleanup: improving on the approach used in "fix: make shuffle fallback decisions sticky across planning passes" (#3982), which was implemented under time pressure.

What changes are included in this PR?

Remove STAGE_FALLBACK_TAG, since it duplicated the existing withInfo mechanism for tagging plans with fallback reasons.
Improve the shuffle serde logic so that the plan is not tagged until both the native and the columnar compatibility checks have run.
Improve the documentation on the withInfo methods (which should really be renamed to something like withFallbackReason, IMO).

How are these changes tested?

Existing tests, especially those added in #3982.

Replace the separate STAGE_FALLBACK_TAG with explain-info-based stickiness. Shuffle path checks (`nativeShuffleFailureReasons`, `columnarShuffleFailureReasons`) are now pure and return reasons instead of tagging eagerly. A new `shuffleSupported` coordinator short-circuits on `hasExplainInfo`, tries native then columnar, and tags via `withInfos` only on total failure. DPP fallback, which disqualifies both paths, moves into the coordinator. This removes the need for `CometFallback` and eliminates the semantic split where `withInfo` could fire for a path-specific failure while the node still converted via a different path.
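The coordinator described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the `Shuffle` class, its boolean fields, and the reason strings are hypothetical stand-ins, and only the names `shuffleSupported`, `nativeShuffleFailureReasons`, `columnarShuffleFailureReasons`, `hasExplainInfo`, and `withInfos` come from the commit message.

```scala
import scala.collection.mutable

// Hypothetical stand-ins: none of these types are Comet's real classes.
object ShuffleFallbackSketch {
  sealed trait ShuffleKind
  case object Native extends ShuffleKind
  case object Columnar extends ShuffleKind

  // A toy plan node that carries its own fallback reasons (the "explain info").
  final class Shuffle(
      val nativeOk: Boolean,
      val columnarOk: Boolean,
      val usesDpp: Boolean) {
    val explainInfo: mutable.LinkedHashSet[String] = mutable.LinkedHashSet.empty
  }

  def hasExplainInfo(s: Shuffle): Boolean = s.explainInfo.nonEmpty
  def withInfos(s: Shuffle, reasons: Iterable[String]): Unit = s.explainInfo ++= reasons

  // Pure checks: they only return reasons and never touch the node.
  def nativeShuffleFailureReasons(s: Shuffle): Seq[String] =
    if (s.nativeOk) Seq.empty else Seq("native shuffle unsupported for this plan")

  def columnarShuffleFailureReasons(s: Shuffle): Seq[String] =
    if (s.columnarOk) Seq.empty else Seq("columnar shuffle unsupported for this plan")

  def shuffleSupported(s: Shuffle): Option[ShuffleKind] = {
    // Sticky: an earlier pass already decided that this node falls back.
    if (hasExplainInfo(s)) return None
    // DPP disqualifies both paths, so it is handled before either check.
    if (s.usesDpp) {
      withInfos(s, Seq("shuffle with dynamic partition pruning (illustrative reason)"))
      return None
    }
    val nativeReasons = nativeShuffleFailureReasons(s)
    if (nativeReasons.isEmpty) return Some(Native)
    val columnarReasons = columnarShuffleFailureReasons(s)
    if (columnarReasons.isEmpty) return Some(Columnar)
    // Tag only when neither path works.
    withInfos(s, nativeReasons ++ columnarReasons)
    None
  }
}
```

In this sketch a node that fails the native check but passes the columnar check is returned as `Some(Columnar)` and stays untagged, so a later planning pass does not mistake it for a fallback.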

…llup Expand the doc comments on withInfo/withInfos/hasExplainInfo to make clear that these record fallback reasons surfaced in extended explain output, and that any call to withInfo is a signal that the node falls back to Spark. Also restore the child-expression rollup for native range-partitioning sort orders that was lost in the earlier refactor: when exprToProto fails on a sort-order expression, its own fallback reasons (e.g. strict floating-point sort) are now copied onto the shuffle's reasons so they surface alongside 'unsupported range partitioning sort order'.
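The child-expression rollup can be illustrated with a small sketch. The `Expr` case class and the way reasons are stored on it are assumptions made for the example; in Comet the reasons would be recorded on the expression when `exprToProto` fails, which is simulated here by precomputed fields.

```scala
object SortOrderRollupSketch {
  // Hypothetical expression node; `fallbackReasons` plays the role of explain info.
  final case class Expr(name: String, supported: Boolean, fallbackReasons: Seq[String])

  // Stand-in for exprToProto: Some(serialized form) on success, None on failure.
  def exprToProto(e: Expr): Option[String] =
    if (e.supported) Some(s"proto(${e.name})") else None

  // Pure check: the shuffle-level reason plus each failing child's own reasons.
  def rangePartitioningFailureReasons(sortOrders: Seq[Expr]): Seq[String] = {
    val failed = sortOrders.filter(e => exprToProto(e).isEmpty)
    if (failed.isEmpty) Seq.empty
    else ("unsupported range partitioning sort order" +: failed.flatMap(_.fallbackReasons)).distinct
  }
}
```

With this shape, a sort order that fails for a strict floating-point reason contributes both the generic shuffle reason and its own, more specific one.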

mbutrovich · 2026-04-20T14:04:33Z

Is this a pure refactor, or does it fix any bugs related to the fallback logic? Asked a different way: should this be a branch-15 backport candidate?

andygrove · 2026-04-20T14:19:29Z

Is this a pure refactor, or does it fix any bugs related to the fallback logic? Asked a different way: should this be a branch-15 backport candidate?

This is a pure refactor. No functional changes.

If I had been in less of a hurry to get a build out on Friday, this is how I would have implemented the fix for #3949

andygrove · 2026-04-20T14:42:38Z

withInfo did not work initially because of the way we tagged shuffles: we tagged them whenever native shuffle was not supported, even when we did not actually fall back because columnar shuffle was supported.

We now figure out all the reasons for both native and columnar before calling withInfo.
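A simplified sketch of why eager tagging is incompatible with using explain info as the sticky marker is shown below. The `Node` class and both functions are hypothetical and much smaller than the real planning code; they only model "does this pass convert the shuffle?".

```scala
import scala.collection.mutable

object EagerTaggingSketch {
  // Toy node: `info` stands in for the fallback reasons attached by withInfo.
  final class Node(val nativeOk: Boolean, val columnarOk: Boolean) {
    val info: mutable.Set[String] = mutable.LinkedHashSet.empty[String]
  }

  // Old shape (simplified): tag as soon as the native check fails.
  def eagerPass(n: Node): Boolean = {
    if (n.info.nonEmpty) return false // sticky check treats the node as fallen back
    if (n.nativeOk) return true
    n.info += "native shuffle unsupported"
    n.columnarOk // may still convert via columnar, but the tag is already set
  }

  // New shape: collect reasons first and tag only if both paths fail.
  def deferredPass(n: Node): Boolean = {
    if (n.info.nonEmpty) return false
    if (n.nativeOk || n.columnarOk) return true
    n.info ++= Seq("native shuffle unsupported", "columnar shuffle unsupported")
    false
  }
}
```

For a node where only columnar shuffle is supported, `eagerPass` converts on the first pass and falls back on the second, while `deferredPass` gives the same answer on every pass.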

comphead · 2026-04-20T15:59:41Z

-    if (!isCometPlan(s.child)) {
-      // we do not need to report a fallback reason if the child plan is not a Comet plan
-      return false
+      reasons += "Comet native shuffle not enabled"


Should we show an example of how to enable native shuffle?

They must have disabled it intentionally, since it is enabled by default.

comphead · 2026-04-20T16:00:48Z

+        throw new IllegalStateException(
+          "shuffleSupported chose native shuffle but children are not all CometNativeExec")
+      case None =>
+        throw new IllegalStateException()


Perhaps it is time to add a meaningful message here?
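One possible shape for such a message is sketched below. The `describeConversion` function, its parameters, and the wording of the `None` message are hypothetical; only the native-shuffle message is taken from the diff.

```scala
object ShuffleChoiceSketch {
  sealed trait ShuffleKind
  case object Native extends ShuffleKind
  case object Columnar extends ShuffleKind

  // Hypothetical conversion step; the real method and its arguments are not shown in the diff.
  def describeConversion(choice: Option[ShuffleKind], childrenAreNative: Boolean): String =
    choice match {
      case Some(Native) if childrenAreNative => "native shuffle"
      case Some(Native) =>
        throw new IllegalStateException(
          "shuffleSupported chose native shuffle but children are not all CometNativeExec")
      case Some(Columnar) => "columnar shuffle"
      case None =>
        // Illustrative wording: state which invariant was violated and by whom.
        throw new IllegalStateException(
          "conversion was requested for a shuffle that shuffleSupported rejected; " +
            "callers are expected to check shuffleSupported first")
    }
}
```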

comphead

Thanks @andygrove, overall LGTM.

parthchandra · 2026-04-20T18:30:34Z

+    // shuffle falls back to Spark and tagged it. Preserve that decision - re-deriving it against
+    // a possibly-reshaped subtree (e.g. AQE stage-wrapping) can flip the answer and produce
+    // inconsistent plans across passes (see #3949).
+    if (hasExplainInfo(s)) return None


Not insisting on this, but I actually think having a separate tag for fallback was better than using the info tags. Info is just that: a bit of information. Its presence or absence should not really be used for making decisions.

Currently, Comet only uses withInfo to indicate that an operator should fall back, AFAIK.

What would be an example of adding explain info to an operator where we do not also want to fall back to Spark? I could see that it could be added to note minor incompatibilities.

We could add info, for instance, to show why Comet chose a hash join instead of a sort merge join (and maybe point the user to a config that allows them to override a comet decision). The idea is that explain info is information that goes with an explain plan and is essentially user facing.

I like the idea. Let me write up an issue
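The separation discussed in this thread could look something like the following sketch. It is purely illustrative: the `Annotations` type and its methods are hypothetical and do not exist in Comet, although `withFallbackReason` mirrors the rename floated in the PR description.

```scala
object ExplainInfoSketch {
  // Hypothetical split: notes are purely informational, fallback reasons drive decisions.
  final case class Annotations(
      notes: Vector[String] = Vector.empty,
      fallbackReasons: Vector[String] = Vector.empty) {

    def withNote(n: String): Annotations = copy(notes = notes :+ n)

    def withFallbackReason(r: String): Annotations =
      copy(fallbackReasons = fallbackReasons :+ r)

    // Only fallback reasons decide whether the operator falls back to Spark.
    def fallsBack: Boolean = fallbackReasons.nonEmpty

    // Both kinds of annotation are user-facing in the explain output.
    def explainLines: Vector[String] =
      fallbackReasons.map("fallback: " + _) ++ notes.map("note: " + _)
  }
}
```

Under this assumption, a note such as "chose hash join over sort merge join" appears in the explain output without changing whether the operator is converted.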

mbutrovich

Thanks @andygrove! Let's keep iterating in this area and improving the user experience!

andygrove added 2 commits April 18, 2026 07:29

andygrove changed the title ~~feat: Improve shuffle fallback logic and reduce planning overhead~~ feat: Improve shuffle fallback logic Apr 18, 2026

andygrove changed the title ~~feat: Improve shuffle fallback logic~~ chore: Improve shuffle fallback logic Apr 18, 2026

chore: trigger CI

cee98e3

andygrove marked this pull request as ready for review April 18, 2026 14:56

andygrove mentioned this pull request Apr 18, 2026

perf: Short-circuit operator serde [experimental] #3992

Draft

andygrove requested review from comphead, mbutrovich and parthchandra April 20, 2026 13:50

comphead reviewed Apr 20, 2026

comphead approved these changes Apr 20, 2026

parthchandra reviewed Apr 20, 2026

andygrove mentioned this pull request Apr 20, 2026

perf: avoid JVM shuffle when sandwiched between non-Comet operators [WIP] #4010

Draft

mbutrovich approved these changes Apr 20, 2026

mbutrovich merged commit 5efd972 into apache:main Apr 20, 2026
135 checks passed

4 participants
