Skip to content

Commit b05e267

Browse files
authored
chore: remove redundant parquet.enable.dictionary ConfigMatrix from SQL tests (#3866)
The ConfigMatrix directive for parquet.enable.dictionary was added as boilerplate to all SQL test files but is redundant since the tests generally do not create tables with duplicate string data that would be dictionary-encoded. Also update contributor guides to stop recommending this directive for new tests.
1 parent 9a7e616 commit b05e267

134 files changed

Lines changed: 11 additions & 257 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.claude/skills/review-comet-pr/SKILL.md

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -149,8 +149,6 @@ Categories include: `aggregate/`, `array/`, `string/`, `math/`, `struct/`, `map/
149149
**SQL file structure:**
150150

151151
```sql
152-
-- ConfigMatrix: parquet.enable.dictionary=false,true
153-
154152
-- Create test data
155153
statement
156154
CREATE TABLE test_crc32(col string, a int, b float) USING parquet

docs/source/contributor-guide/adding_a_new_expression.md

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -217,8 +217,6 @@ It is important to verify that the new expression is correctly recognized by the
217217
Create a `.sql` file under the appropriate subdirectory in `spark/src/test/resources/sql-tests/expressions/` (e.g., `string/`, `math/`, `array/`). The file should create a table with test data, then run queries that exercise the expression. Here is an example for the `unhex` expression:
218218

219219
```sql
220-
-- ConfigMatrix: parquet.enable.dictionary=false,true
221-
222220
statement
223221
CREATE TABLE test_unhex(col string) USING parquet
224222

docs/source/contributor-guide/sql-file-tests.md

Lines changed: 9 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -76,8 +76,6 @@ A test file consists of SQL comments, directives, statements, and queries separa
7676
lines. Here is a minimal example:
7777

7878
```sql
79-
-- ConfigMatrix: parquet.enable.dictionary=false,true
80-
8179
statement
8280
CREATE TABLE test_abs(v double) USING parquet
8381

@@ -106,16 +104,19 @@ Runs the entire file once per combination of values. Multiple `ConfigMatrix` lin
106104
cross product of all combinations.
107105

108106
```sql
109-
-- ConfigMatrix: parquet.enable.dictionary=false,true
107+
-- ConfigMatrix: spark.sql.optimizer.inSetConversionThreshold=100,0
110108
```
111109

112110
This generates two test cases:
113111

114112
```
115-
sql-file: expressions/cast/cast.sql [parquet.enable.dictionary=false]
116-
sql-file: expressions/cast/cast.sql [parquet.enable.dictionary=true]
113+
sql-file: expressions/conditional/in_set.sql [spark.sql.optimizer.inSetConversionThreshold=100]
114+
sql-file: expressions/conditional/in_set.sql [spark.sql.optimizer.inSetConversionThreshold=0]
117115
```
118116

117+
Only add a `ConfigMatrix` directive when there is a real reason to run the test under
118+
multiple configurations. Do not add `ConfigMatrix` directives speculatively.
119+
119120
#### `MinSparkVersion`
120121

121122
Skips the file when running on a Spark version older than the specified version.
@@ -223,12 +224,9 @@ SELECT array(1, 2, 3)[10]
223224

224225
2. Add the Apache license header as a SQL comment.
225226

226-
3. Add a `ConfigMatrix` directive if the test should run with multiple Parquet configurations.
227-
Most expression tests use:
228-
229-
```sql
230-
-- ConfigMatrix: parquet.enable.dictionary=false,true
231-
```
227+
3. Add a `ConfigMatrix` directive only if the test needs to run under multiple configurations
228+
(e.g., testing behavior that varies with a specific Spark config). Do not add `ConfigMatrix`
229+
directives speculatively.
232230

233231
4. Create tables and insert test data using `statement` blocks. Include edge cases such as
234232
`NULL`, boundary values, and negative numbers.

spark/src/test/resources/sql-tests/expressions/aggregate/aggregate_filter.sql

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -83,4 +83,3 @@ SELECT AVG(d) FILTER (WHERE flag = true) FROM test_agg_filter
8383

8484
query spark_answer_only
8585
SELECT grp, AVG(d) FILTER (WHERE flag = true) FROM test_agg_filter GROUP BY grp ORDER BY grp
86-

spark/src/test/resources/sql-tests/expressions/aggregate/avg.sql

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,6 @@
1515
-- specific language governing permissions and limitations
1616
-- under the License.
1717

18-
-- ConfigMatrix: parquet.enable.dictionary=false,true
19-
2018
statement
2119
CREATE TABLE test_avg(i int, l long, f float, d double, grp string) USING parquet
2220

spark/src/test/resources/sql-tests/expressions/aggregate/bit_agg.sql

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,6 @@
1515
-- specific language governing permissions and limitations
1616
-- under the License.
1717

18-
-- ConfigMatrix: parquet.enable.dictionary=false,true
19-
2018
statement
2119
CREATE TABLE test_bit_agg(i int, grp string) USING parquet
2220

spark/src/test/resources/sql-tests/expressions/aggregate/corr.sql

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,6 @@
1616
-- under the License.
1717

1818
-- Config: spark.comet.expression.Corr.allowIncompatible=true
19-
-- ConfigMatrix: parquet.enable.dictionary=false,true
2019

2120
statement
2221
CREATE TABLE test_corr(x double, y double, grp string) USING parquet

spark/src/test/resources/sql-tests/expressions/aggregate/count.sql

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,6 @@
1515
-- specific language governing permissions and limitations
1616
-- under the License.
1717

18-
-- ConfigMatrix: parquet.enable.dictionary=false,true
19-
2018
statement
2119
CREATE TABLE test_count(i int, s string, grp string) USING parquet
2220

spark/src/test/resources/sql-tests/expressions/aggregate/covariance.sql

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
12
-- Licensed to the Apache Software Foundation (ASF) under one
23
-- or more contributor license agreements. See the NOTICE file
34
-- distributed with this work for additional information
@@ -15,8 +16,6 @@
1516
-- specific language governing permissions and limitations
1617
-- under the License.
1718

18-
-- ConfigMatrix: parquet.enable.dictionary=false,true
19-
2019
statement
2120
CREATE TABLE test_covar(x double, y double, grp string) USING parquet
2221

spark/src/test/resources/sql-tests/expressions/aggregate/min_max.sql

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,6 @@
1515
-- specific language governing permissions and limitations
1616
-- under the License.
1717

18-
-- ConfigMatrix: parquet.enable.dictionary=false,true
19-
2018
statement
2119
CREATE TABLE test_min_max(i int, d double, s string, grp string) USING parquet
2220

0 commit comments

Comments
 (0)