[SPARK-57250][SQL] Construct sub-microsecond timestamp typed literals with precision derived from fractional digits by MaxGekk · Pull Request #56306 · apache/spark

MaxGekk · 2026-06-03T16:28:01Z

What changes were proposed in this pull request?

Per the ANSI SQL standard (ISO/IEC 9075-2, Subclause 5.3, Syntax Rule 27), the fractional-seconds precision of a typed timestamp literal is the number of digits in its <seconds fraction>. This PR makes the typed timestamp literals TIMESTAMP '...', TIMESTAMP_NTZ '...', and TIMESTAMP_LTZ '...' construct nanosecond-capable values, deriving the precision p from the fractional-digit count, when the SQL config spark.sql.timestampNanosTypes.enabled is enabled:

7-9 fractional digits -> TimestampNTZNanosType(p) / TimestampLTZNanosType(p)
<= 6 fractional digits -> existing microsecond behavior (unchanged)
> 9 fractional digits -> new INVALID_TIMESTAMP_LITERAL_PRECISION parse error

Concretely:

Add SparkDateTimeUtils.fractionalSecondsDigits(String): Int to count the digits of the seconds fraction in a timestamp string.
Route AstBuilder.visitTypeConstructor for TIMESTAMP/TIMESTAMP_NTZ/TIMESTAMP_LTZ to the nanosecond parse helpers when the preview flag is on and the literal carries 7-9 fractional digits, preserving the existing NTZ/LTZ resolution rules for the bare TIMESTAMP keyword.
Add the INVALID_TIMESTAMP_LITERAL_PRECISION error condition and a QueryParsingErrors.timestampLiteralPrecisionExceedsMaxError factory for literals with more than 9 fractional-second digits.

Why are the changes needed?

To support standard-compliant sub-microsecond (nanosecond) timestamp typed literals as part of the nanosecond timestamp preview umbrella (SPARK-56822). The ANSI SQL literal grammar carries no explicit precision argument; the precision is implied by the number of fractional-second digits, so the parser must derive it from the literal text.

Does this PR introduce any user-facing change?

Yes, but only when the preview flag spark.sql.timestampNanosTypes.enabled is set to true. In that case a typed timestamp literal with 7-9 fractional-second digits now produces a nanosecond-precision value instead of being truncated to microseconds, and a literal with more than 9 fractional digits raises INVALID_TIMESTAMP_LITERAL_PRECISION. With the flag off (the default), behavior is unchanged: literals are parsed at microsecond precision as before.

How was this patch tested?

Added cases to ExpressionParserSuite ("SPARK-57250: nanosecond timestamp typed literals") covering:

7/8/9-digit fractional precision for TIMESTAMP_NTZ, TIMESTAMP_LTZ, and bare TIMESTAMP;
NTZ/LTZ resolution based on the keyword and the presence of a time-zone part;
boundary values (pre-epoch, max year, nanosWithinMicro 0 and 999);
the 6-digit-stays-microsecond regression;
the >9 fractional-digit error;
the flag-off regression (microsecond behavior preserved).

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor (Claude Opus 4.8)

… with precision derived from fractional digits ### What changes were proposed in this pull request? Per ANSI SQL (ISO/IEC 9075-2, Subclause 5.3, Syntax Rule 27), the fractional-seconds precision of a typed timestamp literal is the number of digits in its `<seconds fraction>`. This PR makes `TIMESTAMP`, `TIMESTAMP_NTZ`, and `TIMESTAMP_LTZ` typed literals construct nanosecond-capable values with precision `p` derived from the fractional-digit count when the nanosecond preview (`spark.sql.timestampNanosTypes.enabled`) is on: - 7-9 fractional digits -> `TimestampNTZNanosType(p)` / `TimestampLTZNanosType(p)` - <= 6 fractional digits -> existing microsecond behavior (unchanged) - > 9 fractional digits -> `INVALID_TIMESTAMP_LITERAL_PRECISION` parse error A new `fractionalSecondsDigits` helper counts the digits of the seconds fraction, and `AstBuilder.visitTypeConstructor` routes to the nanos parse helpers accordingly. ### Why are the changes needed? To support standard-compliant sub-microsecond timestamp literals as part of the nanosecond timestamp preview (SPARK-56822). ### Does this PR introduce any user-facing change? Yes, but only when the nanosecond preview flag is enabled. With the flag off, behavior is unchanged. ### How was this patch tested? Added cases to `ExpressionParserSuite` covering 7/8/9-digit NTZ/LTZ/TIMESTAMP literals, zone resolution, boundary values, the 6-digit-stays-micro regression, the `>9`-digit error, and the flag-off regression.

…ll-through test cases Co-authored-by: Isaac

…actional-digit literals Co-authored-by: Isaac

MaxGekk · 2026-06-03T20:42:58Z

@srielau May I ask you to review this PR when you have time.

MaxGekk · 2026-06-04T09:31:12Z

@stevomitric @uros-b Could you review this PR, please.

uros-b · 2026-06-04T14:31:31Z

+   * `<seconds fraction>`). Digits beyond the fractional run (e.g. a trailing time zone) are not
+   * counted.
+   */
+  def fractionalSecondsDigits(s: String): Int = {


What if this gets something like abcd.1234?

Good question. fractionalSecondsDigits is intentionally a lightweight digit counter that runs before the full timestamp parse, so it doesn't validate the rest of the string itself. For abcd.1234 it returns 4, which is < 7, so nanosLiteralOpt returns None and we fall through to the existing microsecond path -- which then fails to parse abcd.1234 and throws the usual "cannot parse value" error, exactly as on master. The digit count only decides which parse path (micro vs. nanos vs. >9 error) we take, and every path re-parses and validates the whole string.

The one nuance: a malformed value with more than 9 digits after the first . (e.g. abcd.1234567890) short-circuits to INVALID_TIMESTAMP_LITERAL_PRECISION instead of the generic parse error. It's still an error, and the value does literally carry >9 fractional digits, so I kept the current behavior -- happy to gate the precision check behind a successful parse if you'd prefer the generic error there.

I've clarified the doc comment to state this "pre-parse counter, validated downstream" contract and added direct unit tests for fractionalSecondsDigits (no fraction, trailing time zone, non-digit stop, the abcd.1234 case, and the digit-count boundaries).

stevomitric

LGTM.

Two small coverage suggestions, neither blocking:

Add a flag-off regression for >9 fractional digits, since the implementation intentionally keeps legacy truncation when the preview flag is disabled.
Consider direct unit coverage for fractionalSecondsDigits, especially no fraction, timezone suffix, and non-digit stop cases.

…digit regression Address review feedback on apache#56306: - Add direct unit coverage for `fractionalSecondsDigits` in `DateTimeUtilsSuite` (no fraction, trailing dot, 1/6/7/9/10-digit boundaries, non-digit stop cases such as a trailing time zone/whitespace, multi-dot, and malformed inputs). - Add a flag-off regression in `ExpressionParserSuite` asserting that a literal with more than 9 fractional digits narrows to microseconds (instead of raising INVALID_TIMESTAMP_LITERAL_PRECISION) when the nanosecond preview is disabled. - Clarify the `fractionalSecondsDigits` doc comment to state that it is a lightweight pre-parse digit counter and that malformed inputs are rejected downstream by the chosen parse path.

MaxGekk · 2026-06-05T01:59:05Z

Thanks @stevomitric! Both made sense and neither was covered, so I added them:

A flag-off regression with >9 fractional digits asserting the literal narrows to microseconds (...00.1234567890 -> ...00.123456) instead of raising INVALID_TIMESTAMP_LITERAL_PRECISION, locking in the intentional flag-gated validation.
Direct unit tests for fractionalSecondsDigits in DateTimeUtilsSuite, covering no fraction, a trailing time-zone suffix, a non-digit stop (e.g. abcd.1234), the multi-dot case, and the 0/6/7/9/10-digit boundaries.

MaxGekk added 3 commits June 3, 2026 18:24

[SPARK-57250][SQL] Add TIMESTAMP_LTZ-with-offset and special-value fa…

a8991c3

…ll-through test cases Co-authored-by: Isaac

[SPARK-57250][SQL] Document flag-gated validation asymmetry for >9 fr…

00a587e

…actional-digit literals Co-authored-by: Isaac

uros-b reviewed Jun 4, 2026

View reviewed changes

stevomitric reviewed Jun 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-57250][SQL] Construct sub-microsecond timestamp typed literals with precision derived from fractional digits#56306

[SPARK-57250][SQL] Construct sub-microsecond timestamp typed literals with precision derived from fractional digits#56306
MaxGekk wants to merge 4 commits into
apache:masterfrom
MaxGekk:spark-57250-typed-literals

MaxGekk commented Jun 3, 2026 •

edited

Loading

Uh oh!

MaxGekk commented Jun 3, 2026

Uh oh!

MaxGekk commented Jun 4, 2026

Uh oh!

uros-b Jun 4, 2026

Uh oh!

MaxGekk Jun 5, 2026

Uh oh!

stevomitric left a comment

Uh oh!

MaxGekk commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

MaxGekk commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

MaxGekk commented Jun 3, 2026

Uh oh!

MaxGekk commented Jun 4, 2026

Uh oh!

uros-b Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

MaxGekk Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

stevomitric left a comment

Choose a reason for hiding this comment

Uh oh!

MaxGekk commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

MaxGekk commented Jun 3, 2026 •

edited

Loading