@@ -49,14 +49,82 @@ including:
4949- Scientific notation (e.g. ` 1.23E+5 ` ) is supported.
5050- Special values (` inf ` , ` infinity ` , ` nan ` ) produce ` NULL ` .
5151
52+ ## String to Date
53+
54+ Comet's native ` CAST(string AS DATE) ` implementation matches Apache Spark's behavior for years
55+ between 262143 BC and 262142 AD. This range limitation comes from the underlying chrono library's
56+ ` NaiveDate ` type. Spark itself supports a wider range. All three eval modes (Legacy, ANSI, Try)
57+ are supported.
58+
59+ Supported input formats match Spark exactly:
60+
61+ - ` yyyy ` , ` yyyy-[m]m ` , ` yyyy-[m]m-[d]d `
62+ - Optional ` T ` suffix with arbitrary trailing text (e.g. ` 2020-01-01T12:34:56 ` )
63+ - Leading/trailing whitespace and control characters are trimmed
64+ - Optional sign prefix (` - ` for negative years)
65+ - Leading zeros (e.g. ` 0002020-01-01 ` is year 2020)
66+
67+ ## Date to Timestamp
68+
69+ Comet's native ` CAST(date AS TIMESTAMP) ` is compatible with Spark. The cast interprets each
70+ date as midnight in the session timezone and converts to a UTC epoch value. DST transitions
71+ are handled correctly, including spring-forward gaps (where midnight may not exist) and
72+ fall-back ambiguity (where Comet picks the earlier/DST occurrence, matching Spark's
73+ ` LocalDate.atStartOfDay(zoneId) ` behavior).
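
As a rough illustration of the conversion described above, the following Python sketch computes the UTC epoch value, in microseconds, of local midnight for a given date. The function name `date_to_timestamp_micros` and the `session_tz` parameter are hypothetical and are not part of Comet or Spark. The sketch assumes that midnight exists and is unambiguous in the given zone, so it does not reproduce the gap and overlap handling of `LocalDate.atStartOfDay(zoneId)`.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_timestamp_micros(d: date, session_tz: tzinfo) -> int:
    """Hypothetical helper: UTC epoch microseconds of local midnight on `d`.

    Assumes midnight exists and is unambiguous in `session_tz`.
    """
    local_midnight = datetime(d.year, d.month, d.day, tzinfo=session_tz)
    # Subtracting two aware datetimes compares them on the UTC timeline.
    return (local_midnight - EPOCH_UTC) // timedelta(microseconds=1)


# A fixed UTC+08:00 offset stands in for the session timezone here.
utc_plus_8 = timezone(timedelta(hours=8))
micros = date_to_timestamp_micros(date(1970, 1, 2), utc_plus_8)
```

A fixed offset keeps the example independent of the system's timezone database; any `tzinfo` object, such as a `zoneinfo.ZoneInfo` instance, could be passed instead.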
74+
75+ ## Date to TimestampNTZ
76+
77+ Comet's native ` CAST(date AS TIMESTAMP_NTZ) ` is compatible with Spark. The cast is
78+ timezone-independent: each date is converted to midnight as pure arithmetic
79+ (` days * 86,400,000,000 ` microseconds) with no session timezone offset applied. The result
80+ is the same regardless of the session timezone setting.
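
The arithmetic above can be sketched in a few lines of Python. The function name `date_to_timestamp_ntz_micros` is hypothetical and only illustrates the `days * 86,400,000,000` rule; it is not Comet's implementation.

```python
from datetime import date

MICROS_PER_DAY = 86_400_000_000  # 24 * 60 * 60 * 1_000_000
UNIX_EPOCH = date(1970, 1, 1)


def date_to_timestamp_ntz_micros(d: date) -> int:
    """Hypothetical helper: microseconds since the epoch for midnight on `d`.

    No timezone is consulted, so the result does not depend on any session setting.
    """
    days_since_epoch = (d - UNIX_EPOCH).days
    return days_since_epoch * MICROS_PER_DAY


ntz_micros = date_to_timestamp_ntz_micros(date(2020, 1, 1))
```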
81+
82+ ## Date to Numeric Types
83+
84+ In Legacy mode, ` CAST(date AS INT) ` , ` CAST(date AS LONG) ` , and casts from date to Boolean and the
85+ remaining numeric types (Byte, Short, Float, Double, Decimal) always return ` NULL ` . Comet handles
86+ this by short-circuiting to a null literal during query planning, so no native execution
87+ is needed. In ANSI and Try modes, Spark rejects these casts at analysis time (before
88+ execution reaches Comet).
89+
5290## String to Timestamp
5391
5492Comet's native ` CAST(string AS TIMESTAMP) ` implementation supports all timestamp formats accepted
5593by Apache Spark, including ISO 8601 date-time strings, date-only strings, time-only strings
5694(` HH:MM:SS ` ), embedded timezone offsets (e.g. ` +07:30 ` , ` GMT-01:00 ` , ` UTC ` ), named timezone
5795suffixes (e.g. ` Europe/Moscow ` ), and the full Spark timestamp year range
58- (-290308 to 294247). Note that ` CAST(string AS DATE) ` is only compatible for years between
59- 262143 BC and 262142 AD due to an underlying library limitation.
96+ (-290308 to 294247).
97+
98+ ## String to TimestampNTZ
99+
100+ Comet's native ` CAST(string AS TIMESTAMP_NTZ) ` implementation matches Apache Spark's behavior.
101+ Unlike ` CAST(string AS TIMESTAMP) ` , this cast is timezone-independent: any timezone offset in
102+ the input string (e.g. ` +08:00 ` , ` Z ` , ` UTC ` ) is silently discarded, and the local date-time
103+ components are preserved as-is. Time-only strings (e.g. ` T12:34:56 ` , ` 12:34 ` ) produce ` NULL ` .
104+ The result is always a wall-clock timestamp with no timezone conversion or DST adjustment.
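
The offset-discarding behavior described above can be illustrated with a small Python sketch. Here Python's `datetime.fromisoformat` stands in for the parser and accepts a much narrower set of inputs than Spark does; the function name `string_to_timestamp_ntz` is hypothetical.

```python
from datetime import datetime


def string_to_timestamp_ntz(text: str) -> datetime:
    """Hypothetical helper: keep the wall-clock fields, drop any UTC offset.

    Only handles ISO 8601 strings that `datetime.fromisoformat` understands.
    """
    parsed = datetime.fromisoformat(text.strip())
    # replace(tzinfo=None) removes the offset without shifting the time.
    return parsed.replace(tzinfo=None)


east = string_to_timestamp_ntz("2020-01-01T12:34:56+08:00")
west = string_to_timestamp_ntz("2020-01-01T12:34:56-05:00")
```

Both inputs yield the same wall-clock value, because the offset is dropped rather than applied.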
105+
106+ ## TimestampNTZ Casts
107+
108+ Comet supports the following ` TIMESTAMP_NTZ ` casts natively:
109+
110+ | Cast | Compatible | Notes |
111+ | ---------------------------------- | ---------- | ----------------------------------------------------------------- |
112+ | ` CAST(timestamp_ntz AS STRING) ` | Yes | Formats local time as-is, timezone-independent |
113+ | ` CAST(timestamp_ntz AS DATE) ` | Yes | Extracts the date component, timezone-independent |
114+ | ` CAST(timestamp_ntz AS TIMESTAMP) ` | Yes | Interprets NTZ as local time in session TZ, converts to UTC epoch |
115+ | ` CAST(date AS TIMESTAMP_NTZ) ` | Yes | Pure arithmetic, timezone-independent |
116+ | ` CAST(timestamp AS TIMESTAMP_NTZ) ` | Yes | Shifts UTC epoch to local time in session TZ |
117+ | ` CAST(string AS TIMESTAMP_NTZ) ` | Yes | See [String to TimestampNTZ](#string-to-timestampntz) above |
118+
119+ The NTZ-to-Timestamp and Timestamp-to-NTZ casts are session-timezone-dependent (the session
120+ timezone determines the UTC offset). All other NTZ casts are timezone-independent and produce
121+ the same result regardless of the session timezone.
122+
123+ ## Date to String
124+
125+ Comet's native ` CAST(date AS STRING) ` is compatible with Spark. Years below 1000 are
126+ zero-padded to four digits (e.g. year 999 renders as ` 0999-01-01 ` ). Years above 9999 are
127+ rendered without truncation. The cast is timezone-independent.
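
The padding rule above can be sketched as follows. The function name `format_date` is hypothetical, and the sketch takes plain integers because Python's `datetime.date` cannot represent years above 9999. Negative years are out of scope for this illustration.

```python
def format_date(year: int, month: int, day: int) -> str:
    """Hypothetical helper: render a date with the year padded to at least four digits.

    Only non-negative years are handled in this sketch.
    """
    if year < 0:
        raise ValueError("negative years are not covered by this sketch")
    # {:04d} pads short years with zeros and leaves longer years untouched.
    return f"{year:04d}-{month:02d}-{day:02d}"


padded = format_date(999, 1, 1)
wide = format_date(12345, 1, 1)
```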
60128
61129## String to TimestampNTZ
62130