You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Comet supports the following Spark expressions. Expressions that are marked as Spark-compatible will either run
natively in Comet and provide the same results as Spark, or will fall back to Spark for cases that would not
be compatible.
All expressions are enabled by default, but most can be disabled by setting
spark.comet.expression.EXPRNAME.enabled=false, where EXPRNAME is the expression name as specified in
the following tables, such as Length, or StartsWith. See the Comet Configuration Guide for a full list
of expressions that be disabled.
Expressions that are not Spark-compatible will fall back to Spark by default and can be enabled by setting
spark.comet.expression.EXPRNAME.allowIncompatible=true.
Conditional Expressions
Expression
SQL
Spark-Compatible?
CaseWhen
CASE WHEN expr THEN expr ELSE expr END
Yes
If
IF(predicate_expr, true_expr, false_expr)
Yes
Predicate Expressions
Expression
SQL
Spark-Compatible?
And
AND
Yes
EqualTo
=
Yes
EqualNullSafe
<=>
Yes
GreaterThan
>
Yes
GreaterThanOrEqual
>=
Yes
LessThan
<
Yes
LessThanOrEqual
<=
Yes
In
IN
Yes
IsNotNull
IS NOT NULL
Yes
IsNull
IS NULL
Yes
InSet
IN (...)
Yes
Not
NOT
Yes
Or
OR
Yes
String Functions
Expression
Spark-Compatible?
Compatibility Notes
Ascii
Yes
BitLength
Yes
Chr
Yes
Concat
Yes
Only string inputs are supported
ConcatWs
Yes
Contains
Yes
EndsWith
Yes
InitCap
No
Behavior is different in some cases, such as hyphenated names.
Left
Yes
Length argument must be a literal value
Length
Yes
Like
Yes
Lower
No
Results can vary depending on locale and character set. Requires spark.comet.caseConversion.enabled=true
OctetLength
Yes
Reverse
Yes
RLike
No
Uses Rust regexp engine, which has different behavior to Java regexp engine
StartsWith
Yes
StringInstr
Yes
StringRepeat
Yes
Negative argument for number of times to repeat causes exception
StringReplace
Yes
StringLPad
Yes
StringRPad
Yes
StringSpace
Yes
StringTranslate
Yes
StringTrim
Yes
StringTrimBoth
Yes
StringTrimLeft
Yes
StringTrimRight
Yes
Substring
Yes
Upper
No
Results can vary depending on locale and character set. Requires spark.comet.caseConversion.enabled=true
JSON Functions
Expression
Spark-Compatible?
Compatibility Notes
GetJsonObject
No
Spark allows single-quoted JSON and unescaped control characters which Comet does not support
Date/Time Functions
Expression
SQL
Spark-Compatible?
Compatibility Notes
DateAdd
date_add
Yes
DateDiff
datediff
Yes
DateFormat
date_format
Yes
Partial support. Only specific format patterns are supported.
DateSub
date_sub
Yes
DatePart
date_part(field, source)
Yes
Supported values of field: year/month/week/day/dayofweek/dayofweek_iso/doy/quarter/hour/minute
Days
days
Yes
V2 partition transform. Supports DateType and TimestampType inputs.
Extract
extract(field FROM source)
Yes
Supported values of field: year/month/week/day/dayofweek/dayofweek_iso/doy/quarter/hour/minute
FromUnixTime
from_unixtime
No
Does not support format, supports only -8334601211038 <= sec <= 8210266876799
Hour
hour
No
Incorrectly applies timezone conversion to TimestampNTZ inputs (#3180)
LastDay
last_day
Yes
Minute
minute
No
Incorrectly applies timezone conversion to TimestampNTZ inputs (#3180)
Second
second
No
Incorrectly applies timezone conversion to TimestampNTZ inputs (#3180)
NaN dedup differs from Spark. See compatibility guide.
Corr
Yes
Count
Yes
CovPopulation
Yes
CovSample
Yes
First
No
This function is not deterministic. Results may not match Spark.
Last
No
This function is not deterministic. Results may not match Spark.
Max
Yes
Min
Yes
StddevPop
Yes
StddevSamp
Yes
Sum
Yes, except for ANSI mode
VariancePop
Yes
VarianceSamp
Yes
Window Functions
Window support is disabled by default due to known correctness issues. Tracking issue: [#2721](https://github.com/apache/datafusion-comet/issues/2721).
Comet supports using the following aggregate functions within window contexts with PARTITION BY and ORDER BY clauses.
Expression
Spark-Compatible?
Compatibility Notes
Count
Yes
Max
Yes
Min
Yes
Sum
Yes
Note: Dedicated window functions such as rank, dense_rank, row_number, lag, lead, ntile, cume_dist, percent_rank, and nth_value are not currently supported and will fall back to Spark.
Array Expressions
Expression
Spark-Compatible?
Compatibility Notes
ArrayAppend
Yes
ArrayCompact
No
ArrayContains
Yes
ArrayDistinct
Yes
ArrayExcept
No
ArrayFilter
Yes
Only supports case where function is IsNotNull
ArrayInsert
No
ArrayIntersect
No
ArrayJoin
No
ArrayMax
Yes
ArrayMin
Yes
ArrayRemove
Yes
ArrayRepeat
No
ArrayUnion
No
Behaves differently than spark. Comet sorts the input arrays before performing the union, while Spark preserves the order of the first array and appends unique elements from the second.
ArraysOverlap
Yes
CreateArray
Yes
ElementAt
Yes
Input must be an array. Map inputs are not supported.
Flatten
Yes
GetArrayItem
Yes
Map Expressions
Expression
Spark-Compatible?
GetMapValue
Yes
MapKeys
Yes
MapEntries
Yes
MapValues
Yes
MapFromArrays
Yes
Struct Expressions
Expression
Spark-Compatible?
Compatibility Notes
CreateNamedStruct
Yes
GetArrayStructFields
Yes
GetStructField
Yes
JsonToStructs
No
Partial support. Requires explicit schema.
StructsToJson
No
Does not support Infinity/-Infinity for numeric types (#3016)