Skip to content

refactor(storage): compact/recluster flow and clustering stats derivation#19754

Merged
zhyass merged 11 commits intodatabendlabs:mainfrom
zhyass:feat_recluster
Apr 30, 2026
Merged

refactor(storage): compact/recluster flow and clustering stats derivation#19754
zhyass merged 11 commits intodatabendlabs:mainfrom
zhyass:feat_recluster

Conversation

@zhyass
Copy link
Copy Markdown
Member

@zhyass zhyass commented Apr 21, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  • Refactor the responsibility boundary between compact and recluster for FUSE tables. Compact no longer performs reordering, clustering-metadata repair, or forced rewrites caused by schema drift; recluster is responsible only for reclustering by the current cluster key, and schema drift no longer triggers old-block rewrites.

  • Refine recluster semantics so it no longer piggybacks compaction behavior. Blocks that have not been reclustered under the current cluster key are now also eligible as recluster candidates, and can be locally sorted before entering merge-sort when needed.

  • Introduce BlockMetaOptions in the read path to consolidate block metadata flags such as block-index reservation, stream-column updates, and internal-column queries.

  • Update recluster candidate selection, task generation, and statistics derivation. The new flow reuses existing
    cluster_stats when available and otherwise derives min/max from column statistics; those derived values are also reused by clustering_information and by recluster decisions on older blocks and segments.

  • Add segment-summary cluster_stats backfill during mutation writes: when block-level cluster_stats are missing, the new segment summary derives min/max from its own merged col_stats and the current cluster key instead of leaving segment-level clustering stats empty.

  • Simplify the compact path. Compact no longer checks column-id drift, and no longer materializes default values for newly added columns during compaction; after ADD COLUMN, compact only performs physical compaction.

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@zhyass zhyass marked this pull request as draft April 21, 2026 18:07
@github-actions github-actions Bot added the pr-bugfix this PR patches a bug in codebase label Apr 21, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b775db004

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@zhyass zhyass force-pushed the feat_recluster branch 3 times, most recently from 9b0a524 to 386702b Compare April 22, 2026 07:40
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b0a524ffa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@zhyass zhyass changed the title fix(storage): compact and recluster refactor(storage): compact/recluster flow and clustering stats derivation Apr 22, 2026
@zhyass zhyass added pr-refactor this PR changes the code base without new features or bugfix and removed pr-bugfix this PR patches a bug in codebase labels Apr 22, 2026
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@zhyass zhyass marked this pull request as ready for review April 22, 2026 14:20
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dae4fdf523

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/query/service/src/physical_plans/physical_recluster.rs Outdated
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@dantengsky
Copy link
Copy Markdown
Member

@zhyass

Nice work. A few concerns came up while reading through the PR, but none are meant as blockers. Feel free to address them here or in follow-up PRs.

  1. The zero-task block-reorder path can degenerate into no-op commits when selected blocks do not have cluster_stats matching the current cluster key

Example: a table was originally clustered by (a), data was inserted, and later the user runs:

ALTER TABLE t CLUSTER BY (b);
ALTER TABLE t RECLUSTER;

The existing blocks still carry cluster_stats for the old key (a). Since altering the cluster key increments the cluster key id, those stats no longer match the current key (b).

During recluster, target_select derives min/max stats for (b) from col_stats.

For example, with current key (b):

S0: B0 [1,2], B1 [9,10]
S1: B2 [3,4], B3 [5,6]

At segment level, S0 [1,10] overlaps S1 [3,6], so these segments can be selected. But at block level the ranges do not overlap; they are only block-level out of order.

In this case, assuming the blocks are not selected by the small-block compaction checks, no block rewrite task is generated because the block overlap depth does not exceed the threshold.

So recluster falls into the zero-task path: tasks is empty, and all selected blocks are passed as ReclusterParts.remained_blocks.

However, ReclusterInfoSideCar.merged_blocks is populated from ReclusterParts.remained_blocks, and each entry only carries the original BlockMeta. It does not carry the derived cluster stats for the current key (b).

Commit-side segment generation then sorts by BlockMeta.cluster_stats, which still refer to (a), not (b). Since the cluster key id does not match, sort_by_cluster_stats returns Equal, preserving the original block order.

As a result, recluster can create a new snapshot / rewritten segment metadata without actually improving the segment layout.

With block_per_segment = 2, the intended reorder would build new segments like:

S0: B0 [1,2], B2 [3,4] => segment range [1,4]
S1: B3 [5,6], B1 [9,10] => segment range [5,10]

These segment ranges no longer overlap, so segment pruning can benefit from the recluster.

But if commit keeps the original order, the rewritten segments remain:

S0: B0 [1,2], B1 [9,10] => segment range [1,10]
S1: B2 [3,4], B3 [5,6] => segment range [3,6]

The segment ranges still overlap, so the metadata rewrite does not improve clustering/pruning. With RECLUSTER FINAL, the same selected segments can be processed again until the retry limit is reached, producing wasted
no-op commits.

  1. Hilbert recluster may skip compacted segments after generic cluster_stats backfill

For a Hilbert-clustered table, normal inserts do not generate block/segment cluster_stats.

After the user compacts the table, fill_missing_segment_cluster_stats may backfill segment-level cluster_stats for the compacted segment. Later, when the user runs ALTER TABLE t RECLUSTER, the Hilbert recluster
selection path may skip this segment when it is the only candidate segment.

In check_for_recluster, the single-segment branch only checks:

v.cluster_key_id != self.default_cluster_id

Since the backfilled stats use the current cluster key id, this condition is false, so the compacted segment is not selected for Hilbert recluster.

This looks like a behavior regression: before this PR, the compacted segment would still have had cluster_stats = None, and the Hilbert recluster path would have treated it as needing recluster.

Additional note:

Two places implicitly depend on ratios defined inside BlockThresholds

  • block_thresholds.rs calc_rows_for_recluster: recovers bytes_per_block from max_bytes_per_block / 2, assuming the constructor always sets max_bytes_per_block = bytes_per_block * 2.
  • transform_compact_block.rs split_blocks: hardcodes rows_per_block * 9 / 5, which assumes the allowed tail size is 0.8x of the target block size.

The comment in split_blocks does mention "min_rows_per_block is 0.8x", but a reader still has to mentally verify that 9 / 5 = 1 + 4 / 5 and that 4 / 5 matches the ratio used by BlockThresholds. If those ratios
change in BlockThresholds::new, these call sites can silently go out of sync.

Consider exposing these ratios through BlockThresholds, or adding helper methods for the derived values, so callers do not need to recover them with max_bytes_per_block / 2 or rows_per_block * 9 / 5.

@zhyass zhyass added the ci-cloud Build docker image for cloud test label Apr 29, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d9fe6f338

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/query/storages/fuse/src/statistics/cluster_statistics.rs
fix review comments

fix test

fix test

fix test
@zhyass zhyass removed the ci-cloud Build docker image for cloud test label Apr 29, 2026
@zhyass zhyass added the ci-benchmark Benchmark: run all test label Apr 29, 2026
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@zhyass zhyass added ci-benchmark Benchmark: run all test and removed ci-benchmark Benchmark: run all test labels Apr 29, 2026
@github-actions

This comment was marked as outdated.

@zhyass
Copy link
Copy Markdown
Member Author

zhyass commented Apr 29, 2026

@codex review

@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-benchmark Benchmark: run all test labels Apr 29, 2026
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@github-actions
Copy link
Copy Markdown
Contributor

Docker Image for PR

  • tag: pr-19754-95e771b-1777488214

note: this image tag is only available for internal use.

@zhyass zhyass added this pull request to the merge queue Apr 30, 2026
@zhyass zhyass removed this pull request from the merge queue due to a manual request Apr 30, 2026
@zhyass zhyass added this pull request to the merge queue Apr 30, 2026
Merged via the queue into databendlabs:main with commit fe693bb Apr 30, 2026
345 of 352 checks passed
@zhyass zhyass deleted the feat_recluster branch April 30, 2026 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-cloud Build docker image for cloud test pr-refactor this PR changes the code base without new features or bugfix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants