Commit de633c8

andygrove and claude authored
fix: [iceberg] Disable native c2r by default (#3348)
* Revert "chore: Enable native c2r in plan stability suite (#3302)"

  This reverts commit 313cf64.

* revert native c2r by default
* fix
* fix
* trigger CI
* Add .gitattributes to collapse golden file diffs on GitHub

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert "Add .gitattributes to collapse golden file diffs on GitHub"

  This reverts commit 2fc85d1.

* Fix withInfo test to not depend on CometNativeColumnarToRowExec

  Use collectFirst to find CometProjectExec in the plan tree instead of casting through CometNativeColumnarToRowExec, which is no longer present when native columnar-to-row is disabled.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* upmerge and fix imports

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
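
The withInfo fix described in the commit message replaces a cast through a fixed plan shape with a search of the plan tree. The test code itself is not shown on this page, so the following is only a minimal sketch of the idea on toy types. `PlanNode`, `ToyProject`, `ToyColumnarToRow`, `ToyScan`, and `PlanSearch` are hypothetical stand-ins, not Spark or Comet classes. In Spark the equivalent call would be `TreeNode.collectFirst` on the real plan.

```scala
// Toy stand-ins for plan nodes. These are NOT the real Spark/Comet classes.
sealed trait PlanNode { def children: Seq[PlanNode] }
final case class ToyProject(name: String, children: Seq[PlanNode]) extends PlanNode
final case class ToyColumnarToRow(native: Boolean, children: Seq[PlanNode]) extends PlanNode
final case class ToyScan(table: String) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

object PlanSearch {
  // Pre-order search: try the node itself, then each child subtree in order.
  // The iterator keeps the search lazy, so it stops at the first match.
  def collectFirst[A](plan: PlanNode)(pf: PartialFunction[PlanNode, A]): Option[A] =
    pf.lift(plan).orElse(
      plan.children.iterator
        .map(child => collectFirst(child)(pf))
        .collectFirst { case Some(found) => found })
}

object PlanSearchDemo {
  // Same project node under either flavor of columnar-to-row operator.
  val jvmPlan: PlanNode =
    ToyColumnarToRow(native = false, Seq(ToyProject("p", Seq(ToyScan("store")))))
  val nativePlan: PlanNode =
    ToyColumnarToRow(native = true, Seq(ToyProject("p", Seq(ToyScan("store")))))

  // Does not care which operator sits at the root of the plan.
  def findProject(plan: PlanNode): Option[ToyProject] =
    PlanSearch.collectFirst(plan) { case p: ToyProject => p }
}
```

Because the lookup matches on the node type rather than on its position, the same assertion holds whether the root is the native or the JVM columnar-to-row operator.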
1 parent f83f51c commit de633c8

3,282 files changed

Lines changed: 51105 additions & 46327 deletions

common/src/main/scala/org/apache/comet/CometConf.scala

Lines changed: 2 additions & 2 deletions
@@ -304,9 +304,9 @@ object CometConf extends ShimCometConf {
       "Whether to enable native columnar to row conversion. When enabled, Comet will use " +
       "native Rust code to convert Arrow columnar data to Spark UnsafeRow format instead " +
       "of the JVM implementation. This can improve performance for queries that need to " +
-      "convert between columnar and row formats.")
+      "convert between columnar and row formats. This is an experimental feature.")
     .booleanConf
-    .createWithDefault(true)
+    .createWithDefault(false)
 
   val COMET_EXEC_SORT_MERGE_JOIN_WITH_JOIN_FILTER_ENABLED: ConfigEntry[Boolean] =
     conf("spark.comet.exec.sortMergeJoinWithJoinFilter.enabled")
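
The hunk above flips the default from `true` to `false`, so native columnar-to-row conversion is now opt-in. The sketch below illustrates the effect of that default on a toy config entry. `ToyBooleanConf` is a hypothetical model, not Comet's `ConfigEntry` API, and `Key` is a placeholder: the real configuration key lies outside the hunk shown here.

```scala
// Toy model of a boolean config entry. NOT Comet's real ConfigEntry API.
final case class ToyBooleanConf(key: String, default: Boolean) {
  // An explicit setting wins; otherwise fall back to the default.
  def get(settings: Map[String, String]): Boolean =
    settings.get(key).map(_.trim.toBoolean).getOrElse(default)
}

object NativeC2RDefault {
  // Hypothetical placeholder key; the real key is not visible in the diff hunk.
  val Key = "example.nativeColumnarToRow.enabled"

  val before = ToyBooleanConf(Key, default = true)  // behavior prior to this commit
  val after = ToyBooleanConf(Key, default = false)  // behavior after this commit
}
```

With no explicit setting, `before` resolves to enabled and `after` to disabled. A session that sets the key to `true` keeps the native path in both cases.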

spark/src/test/resources/tpcds-plan-stability/approved-plans-v1_4-spark3_5/q1.native_datafusion/explain.txt

Lines changed: 34 additions & 34 deletions
@@ -8,7 +8,7 @@ TakeOrderedAndProject (44)
 : : +- * BroadcastHashJoin Inner BuildRight (28)
 : : :- * Filter (11)
 : : : +- * HashAggregate (10)
-: : : +- CometNativeColumnarToRow (9)
+: : : +- * CometColumnarToRow (9)
 : : : +- CometColumnarExchange (8)
 : : : +- * HashAggregate (7)
 : : : +- * Project (6)
@@ -20,11 +20,11 @@ TakeOrderedAndProject (44)
 : : +- BroadcastExchange (27)
 : : +- * Filter (26)
 : : +- * HashAggregate (25)
-: : +- CometNativeColumnarToRow (24)
+: : +- * CometColumnarToRow (24)
 : : +- CometColumnarExchange (23)
 : : +- * HashAggregate (22)
 : : +- * HashAggregate (21)
-: : +- CometNativeColumnarToRow (20)
+: : +- * CometColumnarToRow (20)
 : : +- CometColumnarExchange (19)
 : : +- * HashAggregate (18)
 : : +- * Project (17)
@@ -34,12 +34,12 @@ TakeOrderedAndProject (44)
 : : : +- Scan parquet spark_catalog.default.store_returns (12)
 : : +- ReusedExchange (15)
 : +- BroadcastExchange (34)
-: +- CometNativeColumnarToRow (33)
+: +- * CometColumnarToRow (33)
 : +- CometProject (32)
 : +- CometFilter (31)
 : +- CometNativeScan parquet spark_catalog.default.store (30)
 +- BroadcastExchange (41)
-+- CometNativeColumnarToRow (40)
++- * CometColumnarToRow (40)
 +- CometProject (39)
 +- CometFilter (38)
 +- CometNativeScan parquet spark_catalog.default.customer (37)
@@ -53,27 +53,27 @@ PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr
 PushedFilters: [IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
 ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>
 
-(2) ColumnarToRow [codegen id : 1]
+(2) ColumnarToRow [codegen id : 2]
 Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
 
-(3) Filter [codegen id : 1]
+(3) Filter [codegen id : 2]
 Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
 Condition : (isnotnull(sr_store_sk#2) AND isnotnull(sr_customer_sk#1))
 
 (4) ReusedExchange [Reuses operator id: 49]
 Output [1]: [d_date_sk#6]
 
-(5) BroadcastHashJoin [codegen id : 1]
+(5) BroadcastHashJoin [codegen id : 2]
 Left keys [1]: [sr_returned_date_sk#4]
 Right keys [1]: [d_date_sk#6]
 Join type: Inner
 Join condition: None
 
-(6) Project [codegen id : 1]
+(6) Project [codegen id : 2]
 Output [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
 Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]
 
-(7) HashAggregate [codegen id : 1]
+(7) HashAggregate [codegen id : 2]
 Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
 Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
 Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]
@@ -84,17 +84,17 @@ Results [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
 Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
 Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=1]
 
-(9) CometNativeColumnarToRow
+(9) CometColumnarToRow [codegen id : 9]
 Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
 
-(10) HashAggregate [codegen id : 5]
+(10) HashAggregate [codegen id : 9]
 Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
 Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
 Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
 Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
 Results [3]: [sr_customer_sk#1 AS ctr_customer_sk#10, sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]
 
-(11) Filter [codegen id : 5]
+(11) Filter [codegen id : 9]
 Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12]
 Condition : isnotnull(ctr_total_return#12)
 
@@ -106,27 +106,27 @@ PartitionFilters: [isnotnull(sr_returned_date_sk#16), dynamicpruningexpression(s
 PushedFilters: [IsNotNull(sr_store_sk)]
 ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>
 
-(13) ColumnarToRow [codegen id : 2]
+(13) ColumnarToRow [codegen id : 4]
 Input [4]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16]
 
-(14) Filter [codegen id : 2]
+(14) Filter [codegen id : 4]
 Input [4]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16]
 Condition : isnotnull(sr_store_sk#14)
 
 (15) ReusedExchange [Reuses operator id: 49]
 Output [1]: [d_date_sk#17]
 
-(16) BroadcastHashJoin [codegen id : 2]
+(16) BroadcastHashJoin [codegen id : 4]
 Left keys [1]: [sr_returned_date_sk#16]
 Right keys [1]: [d_date_sk#17]
 Join type: Inner
 Join condition: None
 
-(17) Project [codegen id : 2]
+(17) Project [codegen id : 4]
 Output [3]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15]
 Input [5]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16, d_date_sk#17]
 
-(18) HashAggregate [codegen id : 2]
+(18) HashAggregate [codegen id : 4]
 Input [3]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15]
 Keys [2]: [sr_customer_sk#13, sr_store_sk#14]
 Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#15))]
@@ -137,17 +137,17 @@ Results [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
 Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
 Arguments: hashpartitioning(sr_customer_sk#13, sr_store_sk#14, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=2]
 
-(20) CometNativeColumnarToRow
+(20) CometColumnarToRow [codegen id : 5]
 Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
 
-(21) HashAggregate [codegen id : 3]
+(21) HashAggregate [codegen id : 5]
 Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
 Keys [2]: [sr_customer_sk#13, sr_store_sk#14]
 Functions [1]: [sum(UnscaledValue(sr_return_amt#15))]
 Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#15))#9]
 Results [2]: [sr_store_sk#14 AS ctr_store_sk#20, MakeDecimal(sum(UnscaledValue(sr_return_amt#15))#9,17,2) AS ctr_total_return#21]
 
-(22) HashAggregate [codegen id : 3]
+(22) HashAggregate [codegen id : 5]
 Input [2]: [ctr_store_sk#20, ctr_total_return#21]
 Keys [1]: [ctr_store_sk#20]
 Functions [1]: [partial_avg(ctr_total_return#21)]
@@ -158,31 +158,31 @@ Results [3]: [ctr_store_sk#20, sum#24, count#25]
 Input [3]: [ctr_store_sk#20, sum#24, count#25]
 Arguments: hashpartitioning(ctr_store_sk#20, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=3]
 
-(24) CometNativeColumnarToRow
+(24) CometColumnarToRow [codegen id : 6]
 Input [3]: [ctr_store_sk#20, sum#24, count#25]
 
-(25) HashAggregate [codegen id : 4]
+(25) HashAggregate [codegen id : 6]
 Input [3]: [ctr_store_sk#20, sum#24, count#25]
 Keys [1]: [ctr_store_sk#20]
 Functions [1]: [avg(ctr_total_return#21)]
 Aggregate Attributes [1]: [avg(ctr_total_return#21)#26]
 Results [2]: [(avg(ctr_total_return#21)#26 * 1.2) AS (avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
 
-(26) Filter [codegen id : 4]
+(26) Filter [codegen id : 6]
 Input [2]: [(avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
 Condition : isnotnull((avg(ctr_total_return) * 1.2)#27)
 
 (27) BroadcastExchange
 Input [2]: [(avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
 Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [plan_id=4]
 
-(28) BroadcastHashJoin [codegen id : 5]
+(28) BroadcastHashJoin [codegen id : 9]
 Left keys [1]: [ctr_store_sk#11]
 Right keys [1]: [ctr_store_sk#20]
 Join type: Inner
 Join condition: (cast(ctr_total_return#12 as decimal(24,7)) > (avg(ctr_total_return) * 1.2)#27)
 
-(29) Project [codegen id : 5]
+(29) Project [codegen id : 9]
 Output [2]: [ctr_customer_sk#10, ctr_store_sk#11]
 Input [5]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12, (avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
 
@@ -201,20 +201,20 @@ Condition : ((staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharC
 Input [2]: [s_store_sk#28, s_state#29]
 Arguments: [s_store_sk#28], [s_store_sk#28]
 
-(33) CometNativeColumnarToRow
+(33) CometColumnarToRow [codegen id : 7]
 Input [1]: [s_store_sk#28]
 
 (34) BroadcastExchange
 Input [1]: [s_store_sk#28]
 Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=5]
 
-(35) BroadcastHashJoin [codegen id : 5]
+(35) BroadcastHashJoin [codegen id : 9]
 Left keys [1]: [ctr_store_sk#11]
 Right keys [1]: [s_store_sk#28]
 Join type: Inner
 Join condition: None
 
-(36) Project [codegen id : 5]
+(36) Project [codegen id : 9]
 Output [1]: [ctr_customer_sk#10]
 Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, s_store_sk#28]
 
@@ -233,20 +233,20 @@ Condition : isnotnull(c_customer_sk#30)
 Input [2]: [c_customer_sk#30, c_customer_id#31]
 Arguments: [c_customer_sk#30, c_customer_id#32], [c_customer_sk#30, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c_customer_id#31, 16, true, false, true) AS c_customer_id#32]
 
-(40) CometNativeColumnarToRow
+(40) CometColumnarToRow [codegen id : 8]
 Input [2]: [c_customer_sk#30, c_customer_id#32]
 
 (41) BroadcastExchange
 Input [2]: [c_customer_sk#30, c_customer_id#32]
 Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=6]
 
-(42) BroadcastHashJoin [codegen id : 5]
+(42) BroadcastHashJoin [codegen id : 9]
 Left keys [1]: [ctr_customer_sk#10]
 Right keys [1]: [c_customer_sk#30]
 Join type: Inner
 Join condition: None
 
-(43) Project [codegen id : 5]
+(43) Project [codegen id : 9]
 Output [1]: [c_customer_id#32]
 Input [3]: [ctr_customer_sk#10, c_customer_sk#30, c_customer_id#32]
 
@@ -258,7 +258,7 @@ Arguments: 100, [c_customer_id#32 ASC NULLS FIRST], [c_customer_id#32]
 
 Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5
 BroadcastExchange (49)
-+- CometNativeColumnarToRow (48)
++- * CometColumnarToRow (48)
 +- CometProject (47)
 +- CometFilter (46)
 +- CometNativeScan parquet spark_catalog.default.date_dim (45)
@@ -279,7 +279,7 @@ Condition : ((isnotnull(d_year#33) AND (d_year#33 = 2000)) AND isnotnull(d_date_
 Input [2]: [d_date_sk#6, d_year#33]
 Arguments: [d_date_sk#6], [d_date_sk#6]
 
-(48) CometNativeColumnarToRow
+(48) CometColumnarToRow [codegen id : 1]
 Input [1]: [d_date_sk#6]
 
 (49) BroadcastExchange

spark/src/test/resources/tpcds-plan-stability/approved-plans-v1_4-spark3_5/q1.native_datafusion/extended.txt

Lines changed: 8 additions & 8 deletions
@@ -7,7 +7,7 @@ TakeOrderedAndProject
 : : +- BroadcastHashJoin
 : : :- Filter
 : : : +- HashAggregate
-: : : +- CometNativeColumnarToRow
+: : : +- CometColumnarToRow
 : : : +- CometColumnarExchange
 : : : +- HashAggregate
 : : : +- Project
@@ -17,23 +17,23 @@ TakeOrderedAndProject
 : : : : +- Scan parquet spark_catalog.default.store_returns [COMET: Native DataFusion scan does not support subqueries/dynamic pruning]
 : : : : +- SubqueryBroadcast
 : : : : +- BroadcastExchange
-: : : : +- CometNativeColumnarToRow
+: : : : +- CometColumnarToRow
 : : : : +- CometProject
 : : : : +- CometFilter
 : : : : +- CometNativeScan parquet spark_catalog.default.date_dim
 : : : +- BroadcastExchange
-: : : +- CometNativeColumnarToRow
+: : : +- CometColumnarToRow
 : : : +- CometProject
 : : : +- CometFilter
 : : : +- CometNativeScan parquet spark_catalog.default.date_dim
 : : +- BroadcastExchange
 : : +- Filter
 : : +- HashAggregate
-: : +- CometNativeColumnarToRow
+: : +- CometColumnarToRow
 : : +- CometColumnarExchange
 : : +- HashAggregate
 : : +- HashAggregate
-: : +- CometNativeColumnarToRow
+: : +- CometColumnarToRow
 : : +- CometColumnarExchange
 : : +- HashAggregate
 : : +- Project
@@ -43,17 +43,17 @@ TakeOrderedAndProject
 : : : +- Scan parquet spark_catalog.default.store_returns [COMET: Native DataFusion scan does not support subqueries/dynamic pruning]
 : : : +- ReusedSubquery
 : : +- BroadcastExchange
-: : +- CometNativeColumnarToRow
+: : +- CometColumnarToRow
 : : +- CometProject
 : : +- CometFilter
 : : +- CometNativeScan parquet spark_catalog.default.date_dim
 : +- BroadcastExchange
-: +- CometNativeColumnarToRow
+: +- CometColumnarToRow
 : +- CometProject
 : +- CometFilter
 : +- CometNativeScan parquet spark_catalog.default.store
 +- BroadcastExchange
-+- CometNativeColumnarToRow
++- CometColumnarToRow
 +- CometProject
 +- CometFilter
 +- CometNativeScan parquet spark_catalog.default.customer
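
Every golden-file change in this commit swaps one operator name for the other. A quick way to check which columnar-to-row operator a plan text uses is to count whole-word mentions of each name. The sketch below is only an illustration and is not part of the plan stability suite; `C2ROperators` and `count` are hypothetical names.

```scala
object C2ROperators {
  // Word boundaries keep the two operator names from matching each other
  // and keep Spark's plain "ColumnarToRow" out of both counts.
  private val Native = "\\bCometNativeColumnarToRow\\b".r
  private val Jvm = "\\bCometColumnarToRow\\b".r

  // Returns (native mentions, JVM mentions) found in the plan text.
  def count(planText: String): (Int, Int) =
    (Native.findAllIn(planText).size, Jvm.findAllIn(planText).size)
}

object C2ROperatorsDemo {
  // Fragments in the shape of the plans above, before and after the change.
  val oldPlan: String =
    """BroadcastExchange
      |+- CometNativeColumnarToRow
      |+- CometProject""".stripMargin

  val newPlan: String =
    """BroadcastExchange
      |+- CometColumnarToRow
      |+- CometProject""".stripMargin
}
```

With native conversion disabled by default, an approved plan is expected to show a zero in the first position of the pair.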
