[CALCITE-7511] Route RelShuttle dispatch through type-specific visit overloads #4928

Open. venkata91 wants to merge 4 commits into apache:main from venkata91:vsowrira/calcite-7511-table-function-scan-accept

Conversation

@venkata91 venkata91 commented May 8, 2026

Jira Link

CALCITE-7511

Changes Proposed

Adds accept(RelShuttle) overrides on rel classes that previously fell through
AbstractRelNode.accept(RelShuttle) to visit(RelNode), plus matching type-specific
RelShuttle.visit(X) overloads (with default impls in RelShuttleImpl and forwarding
overrides in RelHomogeneousShuttle). Mirrors the existing pattern on TableScan.

Affected rel classes (10):

  • Core: TableFunctionScan, Window, Snapshot, Collect, Sample, Uncollect,
    Combine, ConditionalCorrelate, SortExchange, TableSpool
  • The accept(RelShuttle) override is placed on the abstract parent where one exists,
    so all subclasses (not just Logical*) dispatch through the type-specific overload.
    For example, EnumerableTableFunctionScan now also routes through visit(TableFunctionScan).
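
A minimal sketch of the dispatch pattern described above, using toy stand-ins rather than the real Calcite classes (`Node`, `FunctionScan`, `Shuttle`, and `RecordingShuttle` are hypothetical names). It illustrates why the override has to exist on the rel class: Java picks the `visit` overload from the static type of `this` at the call site, so only a class that declares its own `accept` reaches the type-specific overload.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the dispatch pattern; these are stand-ins, not Calcite classes. */
public class DispatchSketch {
  interface Shuttle {
    Node visit(FunctionScan scan); // type-specific overload
    Node visit(Node other);        // generic fallback
  }

  abstract static class Node {
    // Like the AbstractRelNode fallback: `this` has static type Node here,
    // so overload resolution always picks visit(Node).
    Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  // The override lives on the abstract parent, so every subclass inherits it.
  abstract static class FunctionScan extends Node {
    @Override Node accept(Shuttle shuttle) {
      return shuttle.visit(this); // static type FunctionScan: picks visit(FunctionScan)
    }
  }

  static class LogicalFunctionScan extends FunctionScan {}
  static class EnumerableFunctionScan extends FunctionScan {}
  static class Unrouted extends Node {} // no accept override: falls through

  static class RecordingShuttle implements Shuttle {
    final List<String> calls = new ArrayList<>();
    @Override public Node visit(FunctionScan scan) {
      calls.add("visit(FunctionScan)");
      return scan;
    }
    @Override public Node visit(Node other) {
      calls.add("visit(Node)");
      return other;
    }
  }

  /** Visits one node of each kind and returns the overloads that were hit, in order. */
  static List<String> run() {
    RecordingShuttle shuttle = new RecordingShuttle();
    new LogicalFunctionScan().accept(shuttle);
    new EnumerableFunctionScan().accept(shuttle);
    new Unrouted().accept(shuttle);
    return shuttle.calls;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

In this model both subclasses of `FunctionScan` reach the type-specific overload, while `Unrouted` still lands in `visit(Node)`.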

Also adds the previously-missing RelHomogeneousShuttle.visit(LogicalAsofJoin) forwarding
override; subclasses that relied on LogicalAsofJoin not being routed through their
visit(RelNode) override will now see it routed there, matching every other rel type
in the homogeneous shuttle.
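
To illustrate the forwarding behaviour, here is a rough sketch with toy classes (`Join`, `AsofJoin`, `ShuttleImpl`, and `HomogeneousShuttle` are simplified stand-ins, not the Calcite implementations). The homogeneous shuttle forwards each type-specific overload to `visit(Node)`, so a subclass that overrides only `visit(Node)` sees every node type.

```java
/** Toy model of a homogeneous shuttle; stand-ins only, not Calcite classes. */
public class HomogeneousSketch {
  interface Shuttle {
    Node visit(Join join);
    Node visit(AsofJoin join);
    Node visit(Node other);
  }

  abstract static class Node {
    Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  static class Join extends Node {
    @Override Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  static class AsofJoin extends Node {
    @Override Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  /** Default behaviour: each overload handles its own type independently. */
  static class ShuttleImpl implements Shuttle {
    @Override public Node visit(Join join) { return join; }
    @Override public Node visit(AsofJoin join) { return join; }
    @Override public Node visit(Node other) { return other; }
  }

  /** Homogeneous behaviour: every type-specific overload forwards to visit(Node). */
  static class HomogeneousShuttle extends ShuttleImpl {
    @Override public Node visit(Join join) { return visit((Node) join); }
    // Without this forwarding override, AsofJoin would bypass visit(Node).
    @Override public Node visit(AsofJoin join) { return visit((Node) join); }
  }

  /** Counts how many nodes reach a subclass that overrides only visit(Node). */
  static int demo() {
    int[] count = {0};
    Shuttle counter = new HomogeneousShuttle() {
      @Override public Node visit(Node other) {
        count[0]++;
        return other;
      }
    };
    new Join().accept(counter);
    new AsofJoin().accept(counter);
    return count[0];
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Removing the `visit(AsofJoin)` forwarding override in this model would drop the count to one, which is the behaviour difference the paragraph above warns about.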

Migrates SqlHintsConverterTest.HintCollector and ToLogicalConverter off instanceof
workarounds inside visit(RelNode) for the affected types where applicable.

Adds RelShuttleCoverageTest as a regression guard: reflectively scans concrete rels in
org.apache.calcite.rel.core / .logical and asserts every one is covered by a
non-visit(RelNode) overload on RelShuttle. Also asserts every RelShuttle.visit(X)
parameter type declares its own accept(RelShuttle).
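
The first of these assertions can be sketched as follows. This is only an illustration under simplifying assumptions: it checks a hand-written list of toy classes instead of scanning packages, and `Node`, `Scan`, `Orphan`, and `uncovered` are hypothetical names, not the test's actual code.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy reflective coverage check; stand-ins only, not the real test. */
public class CoverageSketch {
  abstract static class Node {}
  abstract static class Scan extends Node {}
  static class LogicalScan extends Scan {}
  static class Orphan extends Node {} // deliberately has no type-specific overload

  interface Shuttle {
    Node visit(Scan scan);
    Node visit(Node other);
  }

  /** Returns the concrete classes reachable only through visit(Node). */
  static List<String> uncovered(List<Class<? extends Node>> rels) {
    // Collect the parameter types of every visit overload except visit(Node).
    Set<Class<?>> visitable = new HashSet<>();
    for (Method m : Shuttle.class.getMethods()) {
      if (m.getName().equals("visit")
          && m.getParameterCount() == 1
          && m.getParameterTypes()[0] != Node.class) {
        visitable.add(m.getParameterTypes()[0]);
      }
    }
    List<String> missing = new ArrayList<>();
    for (Class<? extends Node> rel : rels) {
      if (Modifier.isAbstract(rel.getModifiers())) {
        continue; // only concrete classes need coverage
      }
      // A class is covered if it or one of its ancestors is a visit parameter type.
      boolean covered = false;
      for (Class<?> c = rel; c != null && c != Node.class; c = c.getSuperclass()) {
        if (visitable.contains(c)) {
          covered = true;
          break;
        }
      }
      if (!covered) {
        missing.add(rel.getSimpleName());
      }
    }
    return missing;
  }

  static List<String> demo() {
    List<Class<? extends Node>> rels = new ArrayList<>();
    rels.add(Scan.class);
    rels.add(LogicalScan.class);
    rels.add(Orphan.class);
    return uncovered(rels);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

A real guard would fail the build when the returned list is non-empty; here `Orphan` is reported because no overload other than `visit(Node)` can receive it.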

Documented as a breaking change in site/_docs/history.md (see prior discussion on
CALCITE-7288 / #4620): callers implementing RelShuttle directly must add the new
methods; subclasses of RelShuttleImpl that handled any of these types via instanceof
checks in visit(RelNode) should migrate to the type-specific overrides.
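
A small sketch of that migration, again with toy classes rather than Calcite's (`ShuttleImpl`, `OldCollector`, and `NewCollector` are hypothetical). It assumes, as in the description above, that the default type-specific overload does not call `visit(Node)`, so an `instanceof` branch there is bypassed once the node dispatches through its own overload.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy before/after of the migration; stand-ins only, not Calcite classes. */
public class MigrationSketch {
  interface Shuttle {
    Node visit(Window window);
    Node visit(Node other);
  }

  abstract static class Node {
    Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  static class Window extends Node {
    // With this override, Window no longer reaches visit(Node).
    @Override Node accept(Shuttle shuttle) {
      return shuttle.visit(this);
    }
  }

  static class ShuttleImpl implements Shuttle {
    // Default type-specific handling; it does not delegate to visit(Node).
    @Override public Node visit(Window window) { return window; }
    @Override public Node visit(Node other) { return other; }
  }

  /** Before: instanceof workaround inside visit(Node). */
  static class OldCollector extends ShuttleImpl {
    final List<String> seen = new ArrayList<>();
    @Override public Node visit(Node other) {
      if (other instanceof Window) {
        seen.add("Window");
      }
      return super.visit(other);
    }
  }

  /** After: override the type-specific overload instead. */
  static class NewCollector extends ShuttleImpl {
    final List<String> seen = new ArrayList<>();
    @Override public Node visit(Window window) {
      seen.add("Window");
      return super.visit(window);
    }
  }

  static int oldCount() {
    OldCollector collector = new OldCollector();
    new Window().accept(collector);
    return collector.seen.size();
  }

  static int newCount() {
    NewCollector collector = new NewCollector();
    new Window().accept(collector);
    return collector.seen.size();
  }

  public static void main(String[] args) {
    System.out.println(oldCount() + " " + newCount());
  }
}
```

In this model the old collector silently records nothing, which is why the change needs a migration note rather than just a changelog line.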

@venkata91 venkata91 force-pushed the vsowrira/calcite-7511-table-function-scan-accept branch from 0348fb1 to 0df0e15 Compare May 8, 2026 22:32
Contributor

@mihaibudiu mihaibudiu left a comment

This looks fine. I wonder whether not handling TableFunctionScan was an omission or was intentional.

@venkata91
Author

venkata91 commented May 9, 2026

> This looks fine. I wonder whether not handling TableFunctionScan was an omission or was intentional.

Thanks for the review, @mihaibudiu! It looks like an omission, not intentional:

  • The sibling TableScan does have the override (TableScan.java:180), but TableFunctionScan doesn't — no semantic reason to differ.
  • AbstractRelNode.accept(RelShuttle)'s doc explicitly states the override is expected wherever a corresponding RelShuttle.visit(...) exists (and it does for TableFunctionScan).
  • LogicalWindow and LogicalSnapshot are missing it too. SqlHintsConverterTest.HintCollector had a visit(RelNode) + instanceof workaround covering exactly those three — looks like accumulated workarounds for the same gap.

I encountered this while working on adding support for projection pushdown with UNNEST in Flink. See apache/flink#28127

Without this fix, any rule that walks a Correlate's right subtree with a RexShuttle to renumber $cor0.X references (e.g.
ProjectCorrelateTransposeRule.RelNodesExprsHandler after pruning the left input) silently skips the TableFunction's rexCall — dispatch routes through visit(RelNode) instead of visit(TableFunctionScan). The stale field index survives the rewrite, and at runtime the TableFunction reads the wrong source column.
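
The failure mode can be shown with a toy model. Everything below is hypothetical and greatly simplified (a single integer stands in for the `$cor0.X` field reference, and `Renumberer` stands in for the rule's handler); it is a sketch of the mechanism, not of the actual rule code.

```java
/** Toy model of the stale-index bug; stand-ins only, not Calcite classes. */
public class StaleIndexSketch {
  interface Shuttle {
    Node visit(FunctionScan scan);
    Node visit(Node other);
  }

  abstract static class Node {
    Node accept(Shuttle shuttle) {
      return shuttle.visit(this); // always resolves to visit(Node)
    }
  }

  /** A function scan whose call reads field `fieldIndex` of the correlated row. */
  static class FunctionScan extends Node {
    final int fieldIndex;
    FunctionScan(int fieldIndex) {
      this.fieldIndex = fieldIndex;
    }
  }

  /** The same node with the accept override added. */
  static class FixedFunctionScan extends FunctionScan {
    FixedFunctionScan(int fieldIndex) {
      super(fieldIndex);
    }
    @Override Node accept(Shuttle shuttle) {
      return shuttle.visit(this); // resolves to visit(FunctionScan)
    }
  }

  /** Rewrites field references after the left input has been pruned. */
  static class Renumberer implements Shuttle {
    final int[] oldToNew;
    Renumberer(int[] oldToNew) {
      this.oldToNew = oldToNew;
    }
    @Override public Node visit(FunctionScan scan) {
      return new FixedFunctionScan(oldToNew[scan.fieldIndex]);
    }
    @Override public Node visit(Node other) {
      return other; // nothing known to rewrite here, so the node is left as is
    }
  }

  static int renumber(FunctionScan scan) {
    // Assumed pruning: the left input keeps only old fields 2 and 3, as new fields 0 and 1.
    int[] mapping = {-1, -1, 0, 1};
    Node result = scan.accept(new Renumberer(mapping));
    return ((FunctionScan) result).fieldIndex;
  }

  public static void main(String[] args) {
    System.out.println(renumber(new FunctionScan(3)));      // stale index survives
    System.out.println(renumber(new FixedFunctionScan(3))); // index is remapped
  }
}
```

Without the override the scan keeps index 3 even though the pruned input now has only two fields; with it, the reference is remapped to 1.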

This blocks correct projection pushdown through Correlate-over-TFS shapes such as Flink's UNNEST.

@mihaibudiu
Contributor

Can you ask in Jira why Window and Snapshot are missing?
Maybe you can try also to see whether there was a discussion about this.
Why not add them all now?

@mihaibudiu
Contributor

Actually, this may be a breaking change for the ones who have implemented visitors with workarounds.
That's why it doesn't exist.

@asolimando
Member

> Actually, this may be a breaking change for the ones who have implemented visitors with workarounds. That's why it doesn't exist.

I think this relates to https://issues.apache.org/jira/browse/CALCITE-7288, I'd like to loop in @dssysolyatin to see if this new example could revive the discussion around CALCITE-7288. It would be great to link the two tickets in Jira too. (I would do it but I am on vacation at the moment with limited connectivity)

@dssysolyatin
Contributor

> I'd like to loop in @dssysolyatin to see if this new example could revive the discussion

Thanks for looping me in. Feel free to run with it in any direction. I've already shared my opinion in the task and PR discussion (#4620), which should provide the relevant context. Unfortunately, I don't have the bandwidth to revisit/revive the discussion right now; even small Calcite tasks tend to take me about a month to resolve at this point in my life. :)

@xiedeyantu
Member

I have linked this JIRA to CALCITE-7288. However, this issue seems to remain unresolved. We have currently identified two such problems; should we accept a breaking change? I ask because I suspect that, in the short term at least, no one is likely to devise and implement a perfect logical solution.

@venkata91
Author

venkata91 commented May 12, 2026

Thanks @mihaibudiu @asolimando, @xiedeyantu, @dssysolyatin. I read through CALCITE-7288 and the prior discussion on #4620/#4641, and I understand the concern: without a spec + test, every gap fix is a fresh breaking change for consumers with instanceof workarounds.

That said, the gap isn't theoretical for my use case: the missing dispatch produces silently incorrect runtime results in Flink's UNNEST projection pushdown.

Two options I'd be happy to pursue, both documenting the breaking change in the release notes following the pattern from #4620:
A. Expand this PR to cover all three rels (LogicalTableFunctionScan, LogicalWindow, LogicalSnapshot) => the same three that SqlHintsConverterTest.HintCollector's instanceof workaround targets. We can rip the band-aid off by landing the breaking change once across all known gaps rather than spreading it over multiple releases.
B. Keep this PR scoped to LogicalTableFunctionScan only — narrowest blast radius, addresses the concrete bug, defers Window/Snapshot to follow-up JIRAs.

A is more efficient if we're accepting the breaking change anyway; B is safer if we want to tackle Window and Snapshot under the broader CALCITE-7288 effort. Either way, I'd like to keep the discussion of a long-term spec/test on CALCITE-7288 rather than blocking this one.

What do you prefer?

@mihaibudiu
Contributor

I personally would not mind implementing all of them and documenting the breaking change. That's how it's supposed to work. If the upgrade path for users is not difficult, we should describe it for users in the release notes.

However, Calcite uses semantic versioning, so this kind of change should be officially prohibited. But I don't see any Calcite 2.0 release on the horizon, so I don't think this can be postponed indefinitely.

@venkata91
Author

> I personally would not mind implementing all of them and documenting the breaking change. That's how it's supposed to work. If the upgrade path for users is not difficult, we should describe it for users in the release notes.
>
> However, Calcite uses semantic versioning, so this kind of change should be officially prohibited. But I don't see any Calcite 2.0 release on the horizon, so I don't think this can be postponed indefinitely.

Okay, in that case let me also fix it for the other nodes, LogicalWindow and LogicalSnapshot, and add a test to prevent such gaps from being missed later. But even after we document this as a breaking change in the release notes, nothing prevents users from continuing to use the instanceof approach, right?

@mihaibudiu
Contributor

This is only my preference; other people may weigh in on this matter differently.
I don't expect this change will be a lot of work, so perhaps you can update it anyway.

@venkata91 venkata91 force-pushed the vsowrira/calcite-7511-table-function-scan-accept branch from 0df0e15 to 1701be1 Compare May 12, 2026 19:00
@venkata91 venkata91 changed the title [CALCITE-7511] LogicalTableFunctionScan should override accept(RelShuttle) so dispatch routes through RelShuttle.visit(TableFunctionScan) [CALCITE-7511] TableFunctionScan, Window, and Snapshot should override accept(RelShuttle) so dispatch routes through type-specific RelShuttle.visit overloads May 12, 2026
@venkata91 venkata91 force-pushed the vsowrira/calcite-7511-table-function-scan-accept branch 5 times, most recently from 1aed4e4 to bfc7b79 Compare May 12, 2026 22:26
@venkata91
Author

venkata91 commented May 12, 2026

@mihaibudiu As suggested, I updated the fix to cover other RelNodes as well, such as Snapshot and Window, and added a reflection-based test to prevent this from happening in the future. Please review.

@xiedeyantu
Member

I personally agree with Mihai's point of view; indefinite postponement is not a good approach. Unless we can reach a different consensus in a short time, I personally suggest that this be stated in the upgrade documentation.

…e accept(RelShuttle) so dispatch routes through type-specific RelShuttle.visit overloads
@venkata91 venkata91 force-pushed the vsowrira/calcite-7511-table-function-scan-accept branch from bfc7b79 to ff4a2ec Compare May 12, 2026 23:47
@venkata91
Author

> I personally agree with Mihai's point of view; indefinite postponement is not a good approach. Unless we can reach a different consensus in a short time, I personally suggest that this be stated in the upgrade documentation.

Thanks @xiedeyantu. The PR already includes a breaking-change entry under 1.42.0 → Breaking Changes in site/_docs/history.md (link), covering which RelShuttle callers are affected and how to migrate (instanceof branches in visit(RelNode) → type-specific visit(Window) / visit(Snapshot) / visit(TableFunctionScan) overrides). Let me know if you'd like anything added or framed differently there.

Comment thread site/_docs/history.md Outdated
(1) Callers that implement `RelShuttle` directly must implement the two new methods.
(2) Callers that subclassed `RelShuttleImpl` and routed `Window` / `Snapshot` /
`TableFunctionScan` through `visit(RelNode other)` with `instanceof` checks should
migrate to the type-specific `visit(Window)` / `visit(Snapshot)` /
Contributor

Can this include a pointer to the code that is migrated in this PR?
I guess the commit is not yet final, so you cannot point to a commit. Maybe we have to do this in a subsequent PR.

Author

Yes, that will be better.


/** Pre-existing gaps that predate this safety net. Each should be addressed in its own follow-up
* JIRA before being removed from this list. */
private static final Set<String> KNOWN_UNCOVERED_RELS =
Contributor

Why aren't we fixing these too? Is it too disruptive?

Author

Yes, I felt it was expanding the scope substantially. On second thought, I decided to bite the bullet and fix it for the other UNCOVERED_RELS as well. I will update the PR shortly.

Author

Updated it for the other UNCOVERED_RELS as well. Please review.

Contributor

Before merging this let's send a message to the dev list warning people about it, and giving them a chance to comment. Do you want to send it?

Author

Yes, I can send a message about this breaking change to the dev list.

…, Uncollect, Combine, ConditionalCorrelate, SortExchange, and TableSpool

Adds the same dispatch-routing fix to the remaining rel types skip-listed in
RelShuttleCoverageTest. Each gets accept(RelShuttle) on its concrete class
(or abstract parent where one exists), a matching RelShuttle.visit(X) overload,
default visitChildren impl in RelShuttleImpl, and forwarding to
visit((RelNode) x) in RelHomogeneousShuttle.

ToLogicalConverter.visit(Uncollect) forwards to visit((RelNode) uncollect),
mirroring the existing Window override; the instanceof Uncollect branch in
visit(RelNode) is preserved.

RelShuttleCoverageTest no longer needs the KNOWN_UNCOVERED_RELS skip-list or
its reverse-assertion. Also switches SCANNED_PACKAGES to ImmutableSet.of for
JDK 8 compatibility.
@venkata91 venkata91 changed the title [CALCITE-7511] TableFunctionScan, Window, and Snapshot should override accept(RelShuttle) so dispatch routes through type-specific RelShuttle.visit overloads [CALCITE-7511] Route RelShuttle dispatch through type-specific visit overloads May 13, 2026
venkata91 added 2 commits May 13, 2026 13:43
…ds-incompatible and add migration checklist

Restructures the upgrade-doc entry from dense prose into a labeled
"Backwards-incompatible." marker plus a "Migration required if:" bulleted
checklist covering the three consumer scenarios (direct RelShuttle impls,
RelShuttleImpl subclasses with instanceof workarounds, and the
abstract-parent vs Logical* dispatch distinction).

Adds a TODO comment flagging that the target release version is still
pending confirmation with the committer; entry may not land in 1.42.0.
…d by other Breaking Changes entries

Drops the inline "Backwards-incompatible." marker and the bulleted
"Migration required if:" checklist; restores the established prose
style used by past entries (e.g., CALCITE-7029 / CALCITE-7125 /
CALCITE-5716 in 1.41.0). The #### Breaking Changes section header
already conveys the backwards-incompatible status.

Also removes the TODO comment about version assignment; entry stays
under 1.42.0 (committer can move it if a later release is targeted).
Contributor

@mihaibudiu mihaibudiu left a comment

Let's wait to merge this until other contributors have had a chance to comment on, or at least acknowledge, the breaking change.

5 participants