refactor(query): simplify SpillsBufferPool with owned runtime and configurable settings #19781

Merged
zhang2014 merged 11 commits into databendlabs:main from zhang2014:refactor/spill_buffer_pool
Apr 28, 2026

@zhang2014 zhang2014 commented Apr 27, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(query): simplify SpillsBufferPool with owned runtime and configurable settings

  • SpillsBufferPool now owns a dedicated Runtime instead of borrowing GlobalIORuntime
  • Remove SpillTarget from public buffer pool APIs; callers no longer derive it
  • Add spill_buffer_pool_memory and spill_buffer_pool_workers to SpillConfig
  • Track buffer pool blocking time via atomic counters for observability
  • Simplify BufferWriter by removing SpillTarget and redundant comments
  • Clean up tests to use multi_thread tokio flavor
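
One item above, tracking blocking time with atomic counters, can be illustrated with a minimal sketch. This is not the code from the PR: `BlockingStats`, `record`, `total_blocked`, and `calls` are hypothetical names chosen for illustration, and the blocking wait is simulated with a sleep rather than a real buffer acquisition.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Hypothetical counters (names are assumptions, not taken from the PR).
#[derive(Default)]
struct BlockingStats {
    blocked_nanos: AtomicU64,
    blocked_calls: AtomicU64,
}

impl BlockingStats {
    /// Runs `wait` and adds its wall-clock duration to the counters.
    fn record<T, F: FnOnce() -> T>(&self, wait: F) -> T {
        let start = Instant::now();
        let out = wait();
        let nanos = start.elapsed().as_nanos() as u64;
        // Relaxed is enough here because the counters are only read for metrics.
        self.blocked_nanos.fetch_add(nanos, Ordering::Relaxed);
        self.blocked_calls.fetch_add(1, Ordering::Relaxed);
        out
    }

    /// Total time spent inside `record` so far.
    fn total_blocked(&self) -> Duration {
        Duration::from_nanos(self.blocked_nanos.load(Ordering::Relaxed))
    }

    /// Number of recorded blocking calls.
    fn calls(&self) -> u64 {
        self.blocked_calls.load(Ordering::Relaxed)
    }
}

fn main() {
    let stats = BlockingStats::default();
    // Stand-in for "wait until the pool hands back a buffer".
    let buffer = stats.record(|| {
        std::thread::sleep(Duration::from_millis(5));
        vec![0u8; 1024]
    });
    println!(
        "got {} bytes after {} blocked call(s), {:?} blocked in total",
        buffer.len(),
        stats.calls(),
        stats.total_blocked()
    );
}
```

Because the counters are plain atomics, any thread can update them without taking a lock, which keeps the measurement itself from adding contention to the path being measured.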


@github-actions github-actions Bot added the pr-refactor label (this PR changes the code base without new features or bugfix) Apr 27, 2026
@zhang2014 zhang2014 added the ci-cloud label (Build docker image for cloud test) Apr 27, 2026
@zhang2014 zhang2014 marked this pull request as draft April 27, 2026 14:13
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ffaf5413d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/query/service/src/spillers/async_buffer.rs
Comment thread src/query/config/src/config.rs
Comment thread src/query/service/src/spillers/async_buffer.rs Outdated
@github-actions
Contributor

Docker Image for PR

  • tag: pr-19781-c5996dc-1777302182

note: this image tag is only available for internal use.

@github-actions
Contributor

github-actions Bot commented Apr 27, 2026

🤖 CI Job Analysis (Retry 2)

Workflow: 25076352983

📊 Summary

  • Total Jobs: 87
  • Failed Jobs: 1
  • Retryable: 1
  • Code Issues: 0

AUTO-RETRY INITIATED

1 job(s) retried due to infrastructure issues (runner failures, timeouts, etc.)

🔍 Job Details

  • 🔄 linux / test_stateless_standalone: ✅ Retryable (Infrastructure)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).

zhang2014 and others added 9 commits April 28, 2026 11:53
…eateWriter ops

Background workers were directly awaiting Fetch and CreateWriter ops in
their event loop, blocking the worker thread for the duration of the I/O.
With only 2 workers, concurrent Fetch/CreateWriter ops could starve
reader_task_loop tasks (spawned via tokio::spawn onto the same runtime),
causing recv_blocking() in SpillsDataReader::read() to hang indefinitely.

Fix: spawn Fetch and CreateWriter as independent tasks, consistent with
WriterTask and ReaderTask, so workers remain free to dequeue new ops.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
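
The starvation described in this commit message can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: OS threads stand in for tokio tasks, and the `Op` enum, `run_op`, `run_pool`, and `demo` are hypothetical names. Two workers drain a queue in which both `Fetch` ops sit ahead of the `ReaderTask` ops they depend on; because each op is handed to its own task, the workers stay free to dequeue the readers.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical ops, loosely modelled on the names in the commit message.
enum Op {
    /// Blocks until a value arrives on `input`, then forwards it to `reply`.
    Fetch { input: Receiver<u32>, reply: Sender<u32> },
    /// Produces the value that a `Fetch` is waiting for.
    ReaderTask { output: Sender<u32>, value: u32 },
}

fn run_op(op: Op) {
    match op {
        Op::Fetch { input, reply } => {
            let v = input.recv().expect("reader dropped");
            reply.send(v).expect("caller dropped");
        }
        Op::ReaderTask { output, value } => {
            output.send(value).expect("fetch dropped");
        }
    }
}

/// Drains `ops` with `workers` worker threads. Each worker only dequeues;
/// the op itself runs on an independent task (an OS thread in this sketch).
fn run_pool(workers: usize, ops: Vec<Op>) {
    let (tx, rx) = channel::<Op>();
    for op in ops {
        tx.send(op).expect("queue closed");
    }
    drop(tx); // close the queue so workers exit once it is drained
    let rx = Arc::new(Mutex::new(rx));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut tasks = Vec::new();
                loop {
                    // The lock guard is released at the end of this statement.
                    let next = rx.lock().unwrap().recv();
                    match next {
                        // Calling `run_op(op)` inline here instead would park
                        // both workers on Fetch and never reach the readers.
                        Ok(op) => tasks.push(thread::spawn(move || run_op(op))),
                        Err(_) => break,
                    }
                }
                for t in tasks {
                    t.join().unwrap();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}

/// Queues two Fetch ops ahead of their readers and returns the replies.
fn demo() -> Vec<u32> {
    let (mut ops, mut readers, mut replies) = (Vec::new(), Vec::new(), Vec::new());
    for value in vec![10u32, 20] {
        let (data_tx, data_rx) = channel();
        let (reply_tx, reply_rx) = channel();
        ops.push(Op::Fetch { input: data_rx, reply: reply_tx });
        readers.push(Op::ReaderTask { output: data_tx, value });
        replies.push(reply_rx);
    }
    ops.extend(readers);
    run_pool(2, ops);
    replies.iter().map(|r| r.recv().unwrap()).collect()
}

fn main() {
    println!("{:?}", demo());
}
```

In this sketch, running the ops inline in the worker loop would reproduce the hang: both workers would block inside a `Fetch`, and the `ReaderTask` ops that could unblock them would stay in the queue.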
@zhang2014 zhang2014 marked this pull request as ready for review April 28, 2026 20:03
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f91b43173b

Comment thread src/query/service/src/spillers/async_buffer.rs
@zhang2014 zhang2014 merged commit 04ea6c1 into databendlabs:main Apr 28, 2026
260 of 265 checks passed
zhang2014 added a commit that referenced this pull request Apr 29, 2026
…figurable settings (#19781)