[SPARK-57261][SQL] Allow to disable HashAggregateExec by config #56323

Open
pan3793 wants to merge 3 commits into apache:master from pan3793:SPARK-57261

pan3793 (Member) commented Jun 4, 2026

What changes were proposed in this pull request?

Currently, Spark always prefers HashAggregateExec over SortAggregateExec when possible. This PR adds a config, spark.sql.execution.useHashAggregateExec, that lets users disable HashAggregateExec explicitly.
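The selection behavior described here can be sketched as a small toy model. This is not Spark's planner code: the function name `choose_aggregate_operator`, the `hash_agg_supported` flag, and the default value of `true` are assumptions for illustration, and the sketch ignores other operators and conditions the real planner considers. Only the config key comes from this PR.

```python
# Config key taken from this PR; treating "true" as the default is an assumption.
USE_HASH_AGG_KEY = "spark.sql.execution.useHashAggregateExec"


def choose_aggregate_operator(conf: dict, hash_agg_supported: bool) -> str:
    """Toy model of the operator choice (hypothetical helper, not a Spark API).

    conf: session configuration as a plain dict of string keys and values.
    hash_agg_supported: whether the hash-based operator could be used at all
        for this aggregation.
    """
    use_hash = str(conf.get(USE_HASH_AGG_KEY, "true")).lower() == "true"
    if use_hash and hash_agg_supported:
        return "HashAggregateExec"
    # Either the user disabled the hash-based operator or it is not applicable.
    return "SortAggregateExec"
```

With an empty `conf`, the sketch returns `HashAggregateExec` whenever it is supported; setting the key to `"false"` forces `SortAggregateExec`.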

Why are the changes needed?

We found that some jobs fail with an OOM under HashAggregateExec (the automatic fallback logic does not work well), while the same jobs run well with SortAggregateExec:

26/06/04 18:47:30 ERROR [SIGTERM handler] CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
26/06/04 18:47:30 WARN [Executor task launch worker for task 9749.0 in stage 14.0 (TID 61758)] TaskMemoryManager: Failed to allocate a page (2147483648 bytes) for 0 times, try again.
java.lang.OutOfMemoryError: Java heap space
	at org.apache.spark.unsafe.memory.HeapMemoryAllocator.allocate(HeapMemoryAllocator.java:72)
	at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:398)
	at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:359)
	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:96)
	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:868)
	at org.apache.spark.unsafe.map.BytesToBytesMap.growAndRehash(BytesToBytesMap.java:991)
	at org.apache.spark.unsafe.map.BytesToBytesMap$Location.append(BytesToBytesMap.java:817)
	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBufferFromUnsafeRow(UnsafeFixedWidthAggregationMap.java:135)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage11.hashAgg_doConsume_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage11.hashAgg_doAggregateWithKeys_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage11.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:44)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:50)
	at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:593)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:195)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:57)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:111)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:206)
	at org.apache.spark.scheduler.Task.run(Task.scala:147)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:900)
	at org.apache.spark.executor.Executor$TaskRunner$$Lambda$709/0x00007f84474fd558.apply(Unknown Source)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:86)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:83)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:903)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
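The trace fails while the in-memory hash map grows (BytesToBytesMap.growAndRehash). As a rough illustration of why the two strategies differ in memory use, here is a minimal sketch in plain Python. It is a conceptual model only, not Spark's implementation, and the function names are made up for this example. The hash variant keeps one buffer per distinct key alive at once, while the sort variant walks key-sorted input and keeps one buffer at a time.

```python
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Sum values per key with an in-memory map.

    Returns (result, live_buffers): the map holds one buffer per distinct key,
    so memory grows with the number of groups.
    """
    buffers = {}
    for key, value in rows:
        buffers[key] = buffers.get(key, 0) + value
    return buffers, len(buffers)


def sort_aggregate(sorted_rows):
    """Yield (key, sum) from rows that are already sorted by key.

    Only the current group's buffer is live at any time. The sort itself is
    assumed to happen upstream; here the caller sorts in memory for simplicity.
    """
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
hash_result, live_buffers = hash_aggregate(rows)
sort_result = list(sort_aggregate(sorted(rows, key=itemgetter(0))))
```

Both variants produce the same sums; the difference is how many aggregation buffers have to be resident at once.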

Does this PR introduce any user-facing change?

No.

How was this patch tested?

The unit tests were updated.

Also verified with a production case, comparing HashAggregateExec against SortAggregateExec:

[Screenshots: Xnip2026-06-04_21-23-43, Xnip2026-06-04_21-22-38]

Was this patch authored or co-authored using generative AI tooling?

No.

pan3793 (Member, Author) commented Jun 5, 2026

cc @cloud-fan @LuciferYang
