[draft](ivm) Add mv ivm test for pipeline #62606
Draft
yujun777 wants to merge 87 commits into apache:master from
…ogging
- Require non-null detailMessage in IVMRefreshResult.fallback() to match IVMCapabilityResult.unsupported() contract; remove single-arg overload
- Add toString() to IVMRefreshResult for log readability
- Add WARN logging on all fallback paths in IVMRefreshManager with MV name
- Make doRefresh() the public API; remove redundant ivmRefresh() wrapper

- Remove IVMPlanPattern, IVMPlanAnalysis, IVMPlanAnalyzer, IVMDeltaPlannerDispatcher
- IVMCapabilityChecker now takes List<DeltaPlanBundle> instead of IVMPlanAnalysis
- IVMRefreshManager simplified to 2 deps: capabilityChecker + deltaExecutor
- Delta bundles produced by Nereids rules, retrieved via MTMVAnalyzeQueryInfo
- Add analyzeDeltaBundles() hook for testability
- Add ivmDeltaBundles to MTMVAnalyzeQueryInfo, populated from CascadesContext
- Update tests to JUnit 5 and new interface signatures
- Fix checkstyle import order in CreateTableCommandTest

- Delete IvmRewriteMtmvPlan placeholder and its test
- Remove rewriteRootPlan field from CascadesContext (no longer needed)
- Replace IVM_REWRITE_MTMV_PLAN with IVM_NORMALIZE_MTMV_PLAN in RuleType
- Add IvmNormalizeMtmvPlan skeleton (row-id injection, avg rewrite, TODO)
- Add IvmDeltaScanOnly and IvmDeltaAggRoot skeletons
- Merge delta rules into single topic in Rewriter

- Add IvmAnalyzeMode enum (NONE/NORMALIZE_ONLY/FULL) to replace boolean flags
- Replace enableIvmRewriteInNereids with enableIvmNormalRewrite + enableIvmDeltaRewrite
- MTMVPlanUtil.analyzeQuery/analyzeQueryWithSql take IvmAnalyzeMode parameter
- CreateMTMVInfo: NORMALIZE_ONLY for incremental MV, NONE otherwise
- ensureMTMVQueryUsable: same mode as CREATE MV
- IVMRefreshManager: FULL mode (normalize + delta)
- Update IvmNormalizeMtmvPlan/IvmDeltaScanOnly/IvmDeltaAggRoot to use new session vars
- Fix MTMVPlanUtilTest: JUnit5, IvmAnalyzeMode.NONE, updated CountingSessionVariable
- Add testAnalyzeQueryIvmAnalyzeModeSetSessionVariables covering all 3 modes

- Add IvmContext: holds Map<Slot, isDeterministic> rowIdDeterminism + List<DeltaPlanBundle>
- Replace ivmDeltaBundles in CascadesContext with Optional<IvmContext>
- IvmNormalizeMtmvPlan: whitelist-based visitor (DefaultPlanRewriter<IvmContext>)
- visitLogicalOlapScan: inject __IVM_ROW_ID__ at index 0 via LogicalProject
- MOW: Alias(buildRowIdHash(uk...), __IVM_ROW_ID__) -> deterministic
- DUP_KEYS: Alias(UuidNumeric(), __IVM_ROW_ID__) -> non-deterministic
- MOR / AGG_KEYS: throw AnalysisException
- visitLogicalProject: propagate child row-id; throw if child has none
- visit: throw for any unwhitelisted node
- buildRowIdHash: uses murmur_hash3_64 (TODO: replace with 128-bit hash)
- MTMVPlanUtil: read delta bundles from IvmContext instead of direct field
- Tests: DUP_KEYS, MOW (deterministic), MOR (throws), AGG_KEYS (throws),
project propagation, unsupported node, gate disabled
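
The row-id selection described in the list above can be sketched roughly as follows. This is only an illustration of the decision at this commit, not the actual visitor: `KeysModel`, `RowIdChoice`, and `RowIdPlanner` are hypothetical names, the expressions are plain strings rather than Nereids expression trees, and `IllegalArgumentException` stands in for the `AnalysisException` the commit mentions.

```java
import java.util.List;

/** Hypothetical stand-in for the base-table key models named in the commit message. */
enum KeysModel { DUP_KEYS, UNIQUE_MOW, UNIQUE_MOR, AGG_KEYS }

/** Hypothetical result: the row-id expression (as text) and whether it is deterministic. */
final class RowIdChoice {
    final String expression;
    final boolean deterministic;

    RowIdChoice(String expression, boolean deterministic) {
        this.expression = expression;
        this.deterministic = deterministic;
    }
}

final class RowIdPlanner {
    /** Picks a row-id expression for a base-table scan, or rejects the table model. */
    static RowIdChoice choose(KeysModel model, List<String> uniqueKeys) {
        switch (model) {
            case UNIQUE_MOW:
                // Hash of the unique keys: the same row always gets the same row id.
                return new RowIdChoice("murmur_hash3_64(" + String.join(", ", uniqueKeys) + ")", true);
            case DUP_KEYS:
                // No unique key to hash, so a fresh id is generated per row.
                return new RowIdChoice("uuid_numeric()", false);
            default:
                // MOR and AGG_KEYS are rejected at this stage of the PR.
                throw new IllegalArgumentException("IVM does not support base table model " + model);
        }
    }
}
```

The `deterministic` flag corresponds to the `rowIdDeterminism` map kept in `IvmContext`; later commits in this PR change how excluded trigger tables are handled.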
- IvmNormalizeMtmvPlan: whitelist LogicalResultSink, prepend row-id; extract hasRowIdInOutputs/prependRowId helpers
- ColumnDefinition: add newIvmRowIdColumnDefinition with mv_ prefix
- MTMVPlanUtil: prepend row-id ColumnDefinition at index 0; reset IVM session vars in finally block to prevent test leakage
- BaseViewInfo: extract static rewriteProjectsToUserDefineAlias overload
- CreateMTMVInfo: fix rewriteQuerySql to snapshot/restore rewrite map and call alias rewrite when simpleColumnDefinitions present
- CreateTableCommandTest: add 4 IVM UTs covering scan, project-scan, no-alias, alias rewrite, and column count mismatch
- CreateMTMVInfo: set UNIQUE_KEYS + enable_unique_key_merge_on_write=true
for INCREMENTAL refresh MVs; reject user-specified key columns
- MTMVPlanUtil.analyzeKeys: return new List instead of mutating the
immutable input list; throw if IVM row-id column not found in columns
- MTMVPlanUtil.analyzeQuery: only reset IVM session vars in finally block
for modes that actually set them (NORMALIZE_ONLY resets NORMAL only,
FULL resets both, NONE resets neither)
- MTMVPlanUtilTest: add 4 new UTs covering UNIQUE_KEYS+MOW assertion,
DUP_KEYS for non-IVM, and rejection of user-specified UNIQUE/DUP keys
- CountingSessionVariable: count only enabling ("true") setVarOnce calls
…aRewriter
Move IVM delta plan generation out of Nereids rewrite rules into an external IvmDeltaRewriter that will be called by IVMRefreshManager. IvmNormalizeMtmvPlan now stores the normalized plan in IvmContext so IVMRefreshManager can retrieve it for delta rewriting.
- Add normalizedPlan field to IvmContext, store after normalization
- Add ivmNormalizedPlan field to MTMVAnalyzeQueryInfo
- Delete IvmDeltaScanOnly, IvmDeltaAggRoot, IvmAnalyzeMode
- Remove IVM_DELTA_SCAN_ONLY/IVM_DELTA_AGG_ROOT from RuleType/Rewriter
- Remove ENABLE_IVM_DELTA_REWRITE session variable
- Remove deltaCommandBundles from IvmContext
- Replace IvmAnalyzeMode enum with boolean enableIvmNormalize
- Create skeleton IvmDeltaRewriter + IvmDeltaRewriteContext
- Rewrite IVMRefreshManager.analyzeDeltaCommandBundles to use normalized plan (returns empty bundles for now, triggers fallback)
IvmDeltaRewriter no longer extends DefaultPlanRewriter. It now validates the normalized plan is a supported scan-only or project-scan pattern, extracts the base table, and produces an INSERT INTO mv command wrapped in a DeltaCommandBundle. IvmDeltaRewriteContext gains a ConnectContext field, and IVMRefreshManager.analyzeDeltaCommandBundles is wired to call the rewriter.
…es to concrete classes IVMDeltaExecutor now contains real execution logic following the MTMVTask.exec() pattern: creates ConnectContext/StatementContext/StmtExecutor, runs the command, and checks query state. IVMCapabilityChecker returns ok() by default. IVMRefreshManager uses a no-arg public constructor, instantiating both collaborators internally, with a @VisibleForTesting constructor for injection.
For INCREMENTAL MVs, attempt IVM refresh first via IVMRefreshManager. On success, return early and skip partition-based refresh. On fallback, log the reason and continue with existing refresh path.
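
The control flow described above can be sketched as below. This is a simplified illustration under assumed names: `RefreshFlowSketch` and its parameters are hypothetical, and the real task goes through `IVMRefreshManager` and logs the fallback reason rather than returning a string.

```java
import java.util.function.BooleanSupplier;

final class RefreshFlowSketch {
    /**
     * Returns the name of the path that refreshed the MV.
     * ivmRefresh yields true on success and false when IVM falls back.
     */
    static String refresh(boolean incrementalMv, BooleanSupplier ivmRefresh, Runnable partitionRefresh) {
        if (incrementalMv && ivmRefresh.getAsBoolean()) {
            // IVM succeeded: return early and skip the partition-based refresh.
            return "IVM";
        }
        // Either not an INCREMENTAL MV or IVM fell back: continue with the existing path.
        partitionRefresh.run();
        return "PARTITION";
    }
}
```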
…eltaExecutor Extract common command execution boilerplate shared by MTMVTask.exec() and IVMDeltaExecutor.executeBundle() into MTMVPlanUtil.executeCommand(). This also adds the missing audit logging to IVM delta execution.
Keep the hidden IVM row id in refresh planning and exclude it from MV nondeterministic checks. Adjust exchange fragment output expr handling for incremental refresh, rename the MV-specific collector, and add FE UT plus mtmv regression coverage. Tests: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvPlanTest,org.apache.doris.nereids.trees.plans.PlanVisitorTest,org.apache.doris.nereids.trees.plans.commands.UpdateMvByPartitionCommandTest Tests: ./run-regression-test.sh --run -d mtmv_p0 -s test_ivm_basic_mtmv
Ensure the root fragment always rewrites output exprs from the final physical plan outputs so aggregate and TopN plans do not keep stale SlotRefs. Add FE/unit and regression coverage for MTMV hidden row-id changes after complete refresh.
Disable table-sink MV rewrite in the MTMV refresh execution context so refresh planning cannot rewrite back to the target MV. Add a SessionVariable setter and extend UpdateMvByPartitionCommandTest to assert both MV rewrite switches are disabled for the refresh executor. Test: ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.commands.UpdateMvByPartitionCommandTest
Move IVM normalization after sink binding so incremental MTMV inserts keep hidden columns aligned with bound olap sink outputs and target slots. Tests: - bash ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.CreateTableCommandTest - bash ./run-fe-ut.sh --run org.apache.doris.mtmv.MTMVTest,org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvPlanTest
… normalization
- Remove dead condition in BindSink.getColumnToChildOutput: the second clause of the IVM hidden column skip guard was always true by definition of missingIvmHiddenColumns (columns guaranteed absent from child output)
- Add integration test testSinkWithPlaceholderChildReplacesRowIdAndPreservesExprId covering the BindSink placeholder -> IvmNormalizeMtmvPlan replacement pipeline
- Fix checkstyle import order violations in BindSink, IvmNormalizeMtmvPlan, MTMVTest, and IvmNormalizeMtmvPlanTest introduced in the previous commit
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Rename the IVM MTMV normalize rule class, its RuleType constant, and the matching FE unit test to remove the stale Plan suffix and keep the analyzer registration aligned with the new symbol names.
### Release note
None
### Check List (For Author)
- Test: FE unit test via bash ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvTest
- Behavior changed: No
- Does this need documentation: No
…ad of internal HASH(row_id)
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary: IVM materialized views internally rewrite distribution to HASH(__DORIS_IVM_ROW_ID_COL__), but SHOW CREATE MATERIALIZED VIEW was exposing this internal physical detail. Since row_id is a hidden column invisible to users, the DDL output should show DISTRIBUTED BY RANDOM with the same bucket count/auto-bucket setting instead. This makes the output re-executable and preserves the bucket configuration on re-creation.
Also updates CreateMTMVCommandTest to reflect that MIN/MAX aggregates are now supported for IVM (no longer throws AnalysisException), and adds a roundtrip test (TC-4-8) that verifies SHOW CREATE DDL can recreate an identical IVM MV with preserved bucket count.
### Release note
None
### Check List (For Author)
- Test: Unit Test (ShowCreateMTMVTest 8/8, CreateMTMVCommandTest 24/24)
- Behavior changed: Yes (IVM SHOW CREATE now outputs DISTRIBUTED BY RANDOM instead of HASH on hidden column)
- Does this need documentation: No
…otes
Add development guidelines for the IVM module covering:
- Recommended regression and FE unit test suites to run before committing
- Documentation that binlog/stream is not ready (delta is mocked via full scan)
- No backward compatibility requirement before July 2026 public release
…R hints
When an MTMV definition SQL contains SET_VAR hints (e.g., /*+ SET_VAR(enable_force_spill = true) */), the COMPLETE refresh task crashes with NPE at SelectHintSetVar.setVarOnceInSql() because ConnectContext.get().getStatementContext() returns null. The StatementContext was only installed on the ConnectContext inside MTMVPlanUtil.executeCommand(), but UpdateMvByPartitionCommand.from() calls NereidsParser.parseSingle() earlier, which triggers LogicalPlanBuilder.withHints() that requires the StatementContext.
Key changes:
- Install StatementContext on ConnectContext immediately after creation in MTMVTask.exec(), before parsing the MV definition SQL
- Update test_commit_mtmv.out expected output to include the new "refreshMode" field in TaskContext JSON
Unit Test: IvmAggDeltaStrategyTest, IvmNormalizeMtmvTest, CreateMTMVCommandTest, ShowCreateMTMVTest (all pass)
Regression Test: test_mv_case, test_commit_mtmv, test_ivm_agg_mtmv (all pass)
…_true constant folding bug
Problem: IVM delta rewrite hardcoded dml_factor=1 (insert-only assumption), ignoring delete operations. Additionally, assert_true guards in MIN/MAX boundary checks were silently eliminated by FoldConstantRuleOnFE.visitIf when both IF branches were identical.
Key changes:
- IvmSimpleScanDeltaStrategy.visitLogicalOlapScan now checks for a binlog_op column in the base table; if present, derives dml_factor = IF(binlog_op = 0, 1, -1) instead of literal 1. This enables delete-aware INCREMENTAL refresh for tables with binlog_op.
- Fix assert_true being folded away: change IF false branches from identical expressions to NullLiteral, preventing FoldConstantRuleOnFE from collapsing IF(cond, x, x) into x. Affects assertNonNegative, MIN guard, and MAX guard in IvmAggDeltaStrategy.
- Rename transientDelHiddenName prefix to use Column.IVM_HIDDEN_COLUMN_PREFIX convention.
- Add Column.BINLOG_OPERATION_COL constant.
Unit tests:
- IvmSimpleScanDeltaStrategyTest: 14 tests (3 new for op-based dml_factor)
- IvmAggDeltaStrategyTest: 18 tests (all pass with NullLiteral fix)
- IvmNormalizeMtmvTest: 23 tests
- CreateMTMVCommandTest: 24 tests
- ShowCreateMTMVTest: 8 tests
Regression tests:
- test_ivm_basic_mtmv Part 2: scan MV + binlog_op delete semantics
- test_ivm_basic_mtmv Part 3: filter + scan + binlog_op delete propagation
- test_ivm_agg_mtmv Part 4: agg MV MIN boundary delete -> assert_true guard -> FAILED -> COMPLETE recovery
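
To make the dml_factor derivation concrete, here is a small sketch of how IF(binlog_op = 0, 1, -1) turns a batch of delta rows into a signed contribution to a SUM. The class and method names are hypothetical, and plain arrays stand in for the delta scan; it only illustrates the arithmetic, not the plan rewrite.

```java
final class DmlFactorSketch {
    /** Mirrors IF(binlog_op = 0, 1, -1): 0 marks an insert, anything else counts as a delete. */
    static int dmlFactor(int binlogOp) {
        return binlogOp == 0 ? 1 : -1;
    }

    /** Signed sum of a delta batch: inserts add their value, deletes subtract it. */
    static long signedSum(int[] binlogOps, long[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += dmlFactor(binlogOps[i]) * values[i];
        }
        return total;
    }
}
```

For example, inserting 10 and 5 and then deleting the 10 leaves a net delta of 5, which is what a hardcoded factor of 1 would have reported as 25.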
…cument binlog_op/row_id
Reduce code duplication in IvmAggDeltaStrategy by extracting four shared helper methods and consolidating SUM/AVG into a single case block. Also document the binlog_op-based dml_factor derivation and row_id generation approach in the IVM AGENTS.md.
Key changes:
- buildExtremalDeltaOutputs: shared MIN/MAX delta aggregate output builder
- putExtremalSemanticSlots: shared MIN/MAX semantic slot mapping
- buildNewCount: shared assertNonNegative(COALESCE+delta) for all agg types
- buildExtremalTargetExpressions: shared MIN/MAX guard+merge+visible logic
- Consolidate SUM and AVG cases in buildTargetExpressions
- Document dml_factor from binlog_op and row_id generation in AGENTS.md
Unit Test: IvmAggDeltaStrategyTest (18), IvmSimpleScanDeltaStrategyTest (14), IvmNormalizeMtmvTest (23), CreateMTMVCommandTest (24), ShowCreateMTMVTest (8)
Add Part 5 and Part 6 to test_ivm_agg_mtmv to cover NULL value handling in IVM incremental aggregation with binlog_op-based delta.
Key changes:
- Part 5: grouped agg MV with NULL values across SUM/AVG/COUNT/MIN/MAX, including NULL insertion, NULL deletion via binlog_op=1
- Part 6: scalar agg MV starting from all-NULL values, transitioning to mixed NULL/non-NULL via incremental inserts
Unit Test:
- test_ivm_agg_mtmv regression test (all 6 parts pass)
…calar deletion, and MAX boundary
Add three new regression test parts to test_ivm_agg_mtmv.
Problem: IVM agg test coverage had gaps around group deletion, scalar empty table, and MAX boundary deletion scenarios.
Key changes:
- Part 7: Group disappearing: all rows deleted via binlog_op=1, group_count reaches 0, DELETE_SIGN=1 removes group row from MV, then group resurrection
- Part 8: Scalar agg with binlog_op deletes: verifies scalar row persistence through delete/insert cycles (COUNT+SUM only, no MIN/MAX to avoid boundary guards)
- Part 9: MAX boundary deletion: symmetric to Part 4 (MIN), verifies assert_true guard fires when deleting MAX value, INCREMENTAL FAILs, COMPLETE recovers
Unit Test: Regression test test_ivm_agg_mtmv passes (all 9 parts)
Fix AVG(DECIMAL) Divide ClassCastException in IvmAggDeltaStrategy where newSum is DecimalV3 but newCount is BIGINT, causing Divide.getDataType() to fail. Cast newCount to newSum type before creating the Divide.
Key changes:
- Fix: cast newCount to newSum type in AVG visible value computation
- B1: Scalar MIN/MAX plan structure tests (no net-zero filter, constant delete sign, assert_true guard)
- B2: Combined MIN+MAX produces two independent assert_true guards
- B3: Boundary guard predicate is compound Or expression with correct error message
- B4: Composite GROUP BY k1,k2 plan structure (multi-key grouping, row_id join)
- Part 10: Composite group keys regression test with group deletion
- Part 11: Multiple same-type aggs (SUM/SUM/MIN/MAX) hidden state isolation
- Part 12: Negative values / SUM cancellation crossing zero
- Part 13: Empty delta produces NOT_REFRESH
- Part 14: Type widening (TINYINT/DECIMAL/DOUBLE) regression test
- Part 15: Expressions in agg args negative test
Unit Test: IvmAggDeltaStrategyTest (23 tests), all IVM FE tests (92 tests)
Regression: test_ivm_agg_mtmv (Parts 1-15), all IVM regression suites pass
Problem: murmur_hash3_64 implements PropagateNullable, so any NULL group key argument makes the entire hash return NULL. This causes different groups with NULL keys to collide (e.g., (NULL,'a') and (NULL,'b') both produce row_id=NULL), leading to incorrect incremental refresh results.
Key changes:
- Replace direct hash(keys...) with null-safe pattern: hash(ifnull(cast(k AS VARCHAR),''), cast(isnull(k) AS VARCHAR), ...) per key
- Add MurmurHash364(List<Expression>) constructor for cleaner list-based construction
- Remove unused IvmUtil.newIvmCountColumnDefinition()
- Update IVM AGENTS.md with null-safe row_id documentation
Unit Test:
- IvmUtilTest: 7 new tests verifying expression tree structure and non-nullability
- All 99 existing IVM FE unit tests pass
Regression Test:
- test_ivm_agg_4: Parts 16-18 covering single/multiple NULL group keys and empty-string vs NULL distinction
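
The null-safe pattern above can be illustrated by building the hash arguments for one group key tuple. This is a sketch under assumptions: `NullSafeRowIdArgs` is a hypothetical helper, keys are already strings (standing in for `cast(k AS VARCHAR)`), and the "1"/"0" text for `cast(isnull(k) AS VARCHAR)` is assumed for illustration rather than taken from Doris.

```java
import java.util.ArrayList;
import java.util.List;

final class NullSafeRowIdArgs {
    /**
     * Expands each key into two non-null hash arguments:
     * the value with NULL replaced by '' and a flag recording whether it was NULL.
     */
    static List<String> encode(List<String> keys) {
        List<String> args = new ArrayList<>();
        for (String key : keys) {
            args.add(key == null ? "" : key);     // ifnull(cast(k AS VARCHAR), '')
            args.add(key == null ? "1" : "0");    // cast(isnull(k) AS VARCHAR), textual form assumed
        }
        return args;
    }
}
```

Because no argument is ever NULL, a null-propagating hash no longer collapses (NULL,'a') and (NULL,'b') into the same NULL row id, and the extra flag keeps an empty string distinct from NULL.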
…ions
Previously IVM rejected compound expressions like SUM(v1+v2) or MIN(v1*2) inside aggregate functions, requiring bare column Slots only. This relaxes the constraint to accept arbitrary expressions as aggregate arguments.
Key changes:
- IvmAggMeta.AggTarget: exprSlots (List<Slot>) -> exprArgs (List<Expression>)
- IvmNormalizeMtmv: removed instanceof Slot check in buildHiddenStateForAgg
- IvmAggDeltaStrategy: widened helper method params from Slot to Expression
- Renamed all misleading XXXSlot variables/methods to XXXArg where type is Expression
Unit Test: IvmNormalizeMtmvTest (25), IvmAggDeltaStrategyTest (25), all 103 IVM FE tests pass
Regression Test: all 7 IVM regression suites pass
…code
Replace bare "SUM"/"COUNT"/"MIN"/"MAX" strings used as hidden state slot keys with a type-safe StateKey enum in IvmAggMeta. This prevents typos and provides compile-time safety. Also extract addHiddenSumAndCount() and addHiddenAlias() helper methods in IvmNormalizeMtmv to eliminate SUM/AVG code duplication.
Key changes:
- Add IvmAggMeta.StateKey enum with SUM, COUNT, MIN, MAX values
- Change AggTarget.hiddenStateSlots from Map<String,Slot> to Map<StateKey,Slot>
- Update all callsites in IvmAggDeltaStrategy and IvmNormalizeMtmv
- Add DELMIN/DELMAX as private static final String constants (transient keys)
- Extract addHiddenSumAndCount() and addHiddenAlias() in IvmNormalizeMtmv
- Update IvmNormalizeMtmvTest to use StateKey
Unit Test: 103 IVM FE unit tests pass
Regression Test: all 7 IVM suites pass
When an IVM INCREMENTAL refresh fails because a deleted row equals
the current MIN or MAX aggregate value, the assert_true guard fires
at runtime. Previously this was caught as a generic
INCREMENTAL_EXECUTION_FAILED reason, making it hard to distinguish
from real execution errors in logs and task error messages.
Key changes:
- IvmRefreshManager.doRefreshInternal() inspects the caught exception
message for the boundary guard marker ("IVM: deleted row may be
current") and sets IvmFallbackReason.MIN_MAX_BOUNDARY_HIT, which
was defined but never used before
- Boundary hits are logged at INFO level (expected path) while other
execution failures remain at WARN level
- Update regression test error-message assertions (Parts 4 and 9) to
also check for MIN_MAX_BOUNDARY in the reason string
Unit Test: 103 FE unit tests pass; 7/7 IVM regression suites pass
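
The classification step described in this commit can be sketched as follows. The marker text and the two reason names come from the commit message; `FallbackReason` and `FallbackClassifier` are hypothetical stand-ins for `IvmFallbackReason` and the logic inside `IvmRefreshManager.doRefreshInternal()`, and logging is omitted.

```java
/** Hypothetical subset of the fallback reasons named in the commit message. */
enum FallbackReason { MIN_MAX_BOUNDARY_HIT, INCREMENTAL_EXECUTION_FAILED }

final class FallbackClassifier {
    /** Marker emitted by the assert_true boundary guard, as quoted in the commit message. */
    static final String BOUNDARY_MARKER = "IVM: deleted row may be current";

    /** Maps a caught execution error to a fallback reason by inspecting its message. */
    static FallbackReason classify(Throwable error) {
        String message = error.getMessage();
        if (message != null && message.contains(BOUNDARY_MARKER)) {
            // Expected path: a deleted row matched the current MIN/MAX.
            return FallbackReason.MIN_MAX_BOUNDARY_HIT;
        }
        // Anything else is treated as a real execution failure.
        return FallbackReason.INCREMENTAL_EXECUTION_FAILED;
    }
}
```

Matching on message text is brittle if the guard wording changes, so keeping the marker in one shared constant, as sketched here, limits the places that need updating.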
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary: Previously, IVM (Incremental View Maintenance) rejected materialized views with GROUP BY but no aggregate functions, throwing "GROUP BY without aggregate functions is not supported for IVM". This is unnecessarily restrictive because the unconditionally-injected hidden column `__DORIS_IVM_AGG_COUNT_COL__` alone is sufficient to track group membership for incremental maintenance. A bare GROUP BY is semantically equivalent to SELECT DISTINCT, and the delta/apply paths already handle empty aggTargets lists correctly.
### Release note
Support bare GROUP BY (SELECT DISTINCT) queries in IVM materialized views.
### Check List (For Author)
- Test: Unit Test (IvmNormalizeMtmvTest) + Regression test (test_ivm_agg_5)
- Behavior changed: Yes (previously rejected bare GROUP BY with AnalysisException, now accepted)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
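
Why a group count alone is enough for a bare GROUP BY can be seen in a small in-memory model. This is only a sketch of the idea, with a hypothetical `DistinctViewSketch` class and a map standing in for the MV and its hidden count column; the real apply path works on plans, not maps.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class DistinctViewSketch {
    /** Group key -> number of base rows currently in the group (the hidden count). */
    private final Map<String, Long> groupCount = new TreeMap<>();

    /** Applies one delta row: dmlFactor is +1 for an insert and -1 for a delete. */
    void apply(String groupKey, int dmlFactor) {
        long updated = groupCount.getOrDefault(groupKey, 0L) + dmlFactor;
        if (updated < 0) {
            throw new IllegalStateException("group count went negative for " + groupKey);
        }
        if (updated == 0) {
            groupCount.remove(groupKey);   // last row gone: the group leaves the view
        } else {
            groupCount.put(groupKey, updated);
        }
    }

    /** The visible rows of the view: one per group that still has base rows. */
    Set<String> visibleRows() {
        return new TreeSet<>(groupCount.keySet());
    }
}
```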
…ssion tests
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: IVM agg regression suites (test_ivm_agg_1~3, 5) only verified MV output after COMPLETE refresh, leaving INCREMENTAL refresh code paths untested. Also, test_ivm_agg_5 had a logic flaw in Part 1 where consecutive INCREMENTAL refreshes inflated group counts, preventing group deletion from triggering correctly.
### Release note
None
### Check List (For Author)
- Test: Regression test (all 8 suites under mtmv_p0/ivm pass)
- Behavior changed: No
- Does this need documentation: No
Key changes:
- test_ivm_agg_1.groovy: Add 6 order_qt_* assertions after INCREMENTAL refreshes
- test_ivm_agg_2.groovy: Add 8 order_qt_* assertions after INCREMENTAL refreshes
- test_ivm_agg_3.groovy: Add 4 order_qt_* assertions after INCREMENTAL refreshes
- test_ivm_agg_5.groovy: Redesign Part 1 into 3 isolated Scenarios (A/B/C), each starting from a fresh COMPLETE to keep counts accurate; redesign Part 2 to combine delete+insert in one batch before a single INCREMENTAL; fix Scenario B comment errors
- Regenerate test_ivm_agg_1~3.out and generate new test_ivm_agg_5.out
…S.md
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: The binlog_op mocking guide for IVM regression tests was only in an untracked AGENTS.md under the regression-test directory. Merge it into the committed FE IVM AGENTS.md so it is preserved and visible to all contributors, with an additional note about the COMPLETE-before-delete requirement for correct group deletion testing.
### Release note
None
### Check List (For Author)
- Test: No need to test (documentation only)
- Behavior changed: No
- Does this need documentation: No
Rewrite the IVM FE unit tests that still depended on JMockit so FE test compilation works with the current test dependencies.
Key changes:
- replace JMockit usage in IvmDeltaExecutorTest with Mockito static mocks
- rewrite RefreshMTMVInfoAnalyzeTest to use Mockito for Env and catalog setup
- migrate the remaining IVM FE tests away from JMockit imports and expectations
Unit Test:
- mvn test -pl fe-core -Dtest="RefreshMTMVInfoAnalyzeTest" -Dmaven.build.cache.enabled=false
- mvn test -pl fe-core -Dtest="IvmDeltaExecutorTest,IvmDeltaRewriterTest,IvmSimpleScanDeltaStrategyTest,IvmRefreshManagerTest" -Dmaven.build.cache.enabled=false
…anagerTest
### What problem does this PR solve?
Problem Summary: IvmRefreshManagerTest still used JMockit constructs (@Mocked, Expectations), which are not available as a dependency, causing FE compilation failure.
### Release note
None
### Check List (For Author)
- Test: No need to test (build fix only, replacing mock framework usage)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary: IVM (Incremental View Maintenance) was creating redundant hidden columns in the MV schema. For several aggregate types, the visible column already stores the same value as the hidden column, wasting storage and adding unnecessary complexity:
- COUNT(*): hidden COUNT duplicated the global group count
- COUNT(expr): hidden COUNT duplicated the visible COUNT(expr)
- SUM: hidden SUM duplicated the visible SUM value
- MIN/MAX: hidden MIN/MAX duplicated the visible extremal value
This commit removes these redundant hidden columns. The delta apply logic now reads old state from the visible column instead. Only genuinely needed hidden columns are retained:
- SUM/MIN/MAX: hidden COUNT (for assertNonNegative guard and null logic)
- AVG: hidden SUM + COUNT (visible is AVG, not SUM or COUNT)
Additional cleanup:
- Inline addHiddenSumAndCount (only called once for AVG)
- Remove hasIvmHiddenOutputInOutputs/isIvmHiddenOutput private methods
- Simplify group key resolution to direct Slot casting
- Add stateColumnName(StateKey) helper to AggTarget
- Fix toColumn() bug in IvmDeltaTestBase (isVisible/isKey params swapped)
- Update class Javadoc with accurate plan shape
### Release note
None
### Check List (For Author)
- Test: Unit Test (90 IVM FE UTs + 24 CreateMTMVCommandTest all pass) and Regression test (8 IVM regression tests all pass)
- Behavior changed: No (internal schema optimization, no user-visible change)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
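
The retained hidden state per aggregate, as listed in this commit, can be summarized in a small lookup. This is a sketch with hypothetical names (`AggKind`, `HiddenStateSketch`); COUNT(*) and COUNT(expr) are folded into one `COUNT` value for brevity, and the unconditionally injected group count column is separate and not modeled here.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical aggregate kinds; COUNT covers both COUNT(*) and COUNT(expr). */
enum AggKind { COUNT, SUM, AVG, MIN, MAX }

final class HiddenStateSketch {
    /** Returns which per-aggregate hidden state columns are still needed. */
    static Set<AggKind> hiddenColumnsFor(AggKind agg) {
        switch (agg) {
            case COUNT:
                // The visible column already holds the count.
                return EnumSet.noneOf(AggKind.class);
            case SUM:
            case MIN:
            case MAX:
                // Visible column holds the value; a hidden COUNT drives the guard and null logic.
                return EnumSet.of(AggKind.COUNT);
            case AVG:
                // Visible column is the average, so both SUM and COUNT must be kept.
                return EnumSet.of(AggKind.SUM, AggKind.COUNT);
            default:
                throw new IllegalArgumentException("unknown aggregate " + agg);
        }
    }
}
```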
…ey with AggType
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
The IVM (Incremental View Maintenance) code had two redundant abstractions:
1. AggType enum had separate COUNT_STAR and COUNT_EXPR values, but the only
difference is whether exprArgs is empty. Merging them into a single COUNT
type with an isCountStar() helper simplifies all switch statements.
2. StateKey enum {SUM, COUNT, MIN, MAX} was a strict subset of AggType
{COUNT, SUM, AVG, MIN, MAX} (AVG is never used as a hidden-state key).
Eliminating StateKey removes an unnecessary indirection layer.
3. caseWhenExprNotNull was renamed to ifExprNotNull since it generates an
IF expression, not a CASE WHEN.
### Release note
None
### Check List (For Author)
- Test: Unit Test (90 IVM FE UTs + 24 CreateMTMVCommandTest) and Regression test (8 IVM suites)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix compilation and checkstyle errors caused by upstream's TableNameInfo class relocation (org.apache.doris.info → org.apache.doris.catalog.info) and duplicate TStorageType import from conflict resolution. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ustness
### What problem does this PR solve?
Problem Summary:
1. Remove empty if-block in buildHiddenStateForAgg COUNT branch: neither COUNT(*) nor COUNT(expr) adds hidden columns, so the if was dead code.
2. Use Count.isCountStar() instead of Count.isStar() when determining exprArgs. isCountStar() also covers COUNT() and COUNT(literal) forms, making the code robust against optimizer rewrites like COUNT(*)->COUNT(1).
### Release note
None
### Check List (For Author)
- Test: Unit Test (IvmNormalizeMtmvTest, IvmAggDeltaStrategyTest, CreateMTMVCommandTest all pass)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…to zero
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
When all non-null rows contributing to a MIN/MAX aggregate are deleted
(hidden non-null count drops to 0), the boundary guard assertion
(assert_true) would incorrectly fire because the deleted extremal value
equals the current extreme. This caused unnecessary COMPLETE fallback.
This change:
- Adds newCount==0 as the first disjunct in the guard OR condition,
bypassing the boundary check when count is zero (no boundary to protect)
- Replaces nested IF merge logic with CASE WHEN for clarity:
CASE WHEN newCount=0 THEN NULL
WHEN old IS NULL THEN deltaInsert
WHEN deltaInsert IS NULL THEN old
ELSE LEAST/GREATEST END
- Uses flat Or(ImmutableList.of(...)) instead of nested binary Or
- Updates method Javadoc to document the four-way guard condition
- Adds regression tests (test_ivm_agg_6) for two scenarios:
A) Delete all rows: cnt=0, min/max=NULL
B) Delete last non-null row with NULL rows remaining: min/max=NULL
### Release note
None
### Check List (For Author)
- Test: Regression test (test_ivm_agg_6), FE unit test pass
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
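
The CASE WHEN merge from this commit reads naturally as a chain of early returns. The sketch below mirrors it for MIN and MAX over nullable longs; `ExtremalMerge` and its parameter names are hypothetical, and the boundary guard itself is not modeled.

```java
final class ExtremalMerge {
    /** CASE WHEN newCount=0 THEN NULL WHEN old IS NULL THEN deltaInsert WHEN deltaInsert IS NULL THEN old ELSE LEAST END */
    static Long mergeMin(long newCount, Long old, Long deltaInsert) {
        if (newCount == 0) {
            return null;            // no non-null rows left: the aggregate becomes NULL
        }
        if (old == null) {
            return deltaInsert;     // nothing stored yet: take the inserted extreme
        }
        if (deltaInsert == null) {
            return old;             // no inserted values: keep the stored extreme
        }
        return Math.min(old, deltaInsert);
    }

    /** Same shape as mergeMin, with GREATEST in the final branch. */
    static Long mergeMax(long newCount, Long old, Long deltaInsert) {
        if (newCount == 0) {
            return null;
        }
        if (old == null) {
            return deltaInsert;
        }
        if (deltaInsert == null) {
            return old;
        }
        return Math.max(old, deltaInsert);
    }
}
```

Checking the zero count first is what lets scenario A and scenario B in the regression test end with min/max=NULL instead of a stale extreme.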
* [improvement](fe) Add IVM create and alter validation for MTMV (#5) Issue Number: close #xxx Problem Summary: 1. Remove redundant pre-analysis in IVM analyzeQuery - single analyzeQueryInternal call 2. Unify IVM base table validation error messages (AGG_KEYS vs UNIQUE without MOW) 3. Add excluded_trigger_tables support in IvmNormalizeMtmv (transient row-id for excluded tables) 4. Block ALTER MTMV refresh method to/from INCREMENTAL 5. Validate base table models when ALTER MTMV excluded_trigger_tables 6. Extract MTMVPropertyUtil.parseTableNameInfos utility 7. Add comprehensive unit tests for all validation paths * [fix](fe) Fix IVM ExprId collision by reusing parser StatementContext ### What problem does this PR solve? Issue Number: close #xxx Problem Summary: Commit 71a3086 introduced a new StatementContext in analyzeQueryInternal() and restored the original (parser-created) StatementContext in the finally block. This caused ExprId collisions during IVM INCREMENTAL refresh because the parser StatementContext has a much smaller ExprId counter than the analysis StatementContext, and IvmRefreshManager.doRefreshInternal() reads exprIdStart from the ConnectContext's StatementContext after analysis. The fix reverts to the pre-71a3086f pattern: reuse ctx.getStatementContext() (the parser-created StatementContext) directly for analysis instead of creating a new one. This way all ExprId allocations accumulate in the same StatementContext that doRefreshInternal() later reads. ### Release note None ### Check List (For Author) - Test: Regression test (test_ivm_agg_2) - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * [fix](fe) Fix buildRowId to compute proper row-id for excluded trigger tables ### What problem does this PR solve? 
Issue Number: close #xxx
Problem Summary:
buildRowId() short-circuited to UuidNumeric() for excluded trigger tables, losing the deterministic row-id (hash of unique keys) for MOW and non-MOW UNIQUE_KEYS tables. The fix removes the early return and only uses the isExcludedTriggerTable flag to suppress AnalysisException for unsupported table types (AGG_KEYS etc.), while UNIQUE_KEYS tables always compute buildRowIdHash(keySlots) regardless of exclusion status.
### Release note
None
### Check List (For Author)
- Test: Unit Test (IvmNormalizeMtmvTest)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* [test](fe) Update IvmNormalizeMtmvTest for excluded MOW table row-id fix
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
Update testExcludedMowTableUsesTransientRowId to expect deterministic hash-based row-id (Cast expression) instead of UuidNumeric for excluded MOW tables, matching the fix in buildRowId().
### Release note
None
### Check List (For Author)
- Test: Unit Test (IvmNormalizeMtmvTest - 27 tests pass)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* [fix](fe) Refine ALTER MTMV refresh method compatibility rules
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
The ALTER MTMV refresh method validation was too restrictive (blocking all changes to/from INCREMENTAL) or too permissive in some cases.
The new rules:
- COMPLETE <-> INCREMENTAL: forbidden (must recreate MV)
- COMPLETE/INCREMENTAL -> AUTO: allowed
- AUTO -> COMPLETE/INCREMENTAL: forbidden (must recreate MV)
- Same method (no-op): allowed
### Release note
None
### Check List (For Author)
- Test: Regression test
- Behavior changed: Yes (COMPLETE/INCREMENTAL to AUTO now allowed)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* opt code
* [test](fe) Isolate CreateMTMVCommandTest statement context
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
CreateMTMVCommandTest reused the same StatementContext across multiple statements, so table and excluded_trigger_tables state leaked between test cases and caused false incremental MV validation failures. This change resets the statement context for each statement in the test and adds the missing LinkedHashSet import required by StatementContext.
### Release note
None
### Check List (For Author)
- Test: FE unit test
  - ./run-fe-ut.sh --run org.apache.doris.mtmv.ivm.IvmAggDeltaStrategyTest,org.apache.doris.mtmv.ivm.IvmDeltaExecutorTest,org.apache.doris.mtmv.ivm.IvmDeltaRewriterTest,org.apache.doris.mtmv.ivm.IvmRefreshManagerTest,org.apache.doris.mtmv.ivm.IvmSimpleScanDeltaStrategyTest,org.apache.doris.mtmv.ivm.IvmUtilTest,org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvTest,org.apache.doris.nereids.trees.plans.CreateMTMVCommandTest,org.apache.doris.catalog.ShowCreateMTMVTest
- Behavior changed: No
- Does this need documentation: No
* [fix](fe) Fix TableNameInfo imports after rebasing IVM branch
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
Rebase onto the latest yujun777/ivm left several FE IVM files importing org.apache.doris.info.TableNameInfo while the current base branch provides the type in org.apache.doris.catalog.info.TableNameInfo. This fixes the stale imports so the rebased branch compiles and the targeted IVM FE tests run successfully again.
### Release note
None
### Check List (For Author)
- Test: Unit Test
  - org.apache.doris.mtmv.ivm.IvmAggDeltaStrategyTest
  - org.apache.doris.mtmv.ivm.IvmDeltaExecutorTest
  - org.apache.doris.mtmv.ivm.IvmDeltaRewriterTest
  - org.apache.doris.mtmv.ivm.IvmRefreshManagerTest
  - org.apache.doris.mtmv.ivm.IvmSimpleScanDeltaStrategyTest
  - org.apache.doris.mtmv.ivm.IvmUtilTest
  - org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvTest
  - org.apache.doris.nereids.trees.plans.CreateMTMVCommandTest
  - org.apache.doris.catalog.ShowCreateMTMVTest
- Behavior changed: No
- Does this need documentation: No
* [fix](fe) Use deterministic row id for excluded AGG_KEYS tables
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
IVM normalization treated excluded AGG_KEYS base tables as transient-row-id scans and generated uuid_numeric() row ids. This changed excluded AGG_KEYS row identity across refreshes and did not follow the base table aggregate keys. Fix buildRowId to hash base-schema key columns for excluded AGG_KEYS tables, and add focused tests to verify the row id is deterministic and excludes non-key value columns.
### Release note
None
### Check List (For Author)
- Test: Unit Test
  - org.apache.doris.mtmv.ivm.IvmAggDeltaStrategyTest
  - org.apache.doris.mtmv.ivm.IvmDeltaExecutorTest
  - org.apache.doris.mtmv.ivm.IvmDeltaRewriterTest
  - org.apache.doris.mtmv.ivm.IvmRefreshManagerTest
  - org.apache.doris.mtmv.ivm.IvmSimpleScanDeltaStrategyTest
  - org.apache.doris.mtmv.ivm.IvmUtilTest
  - org.apache.doris.nereids.rules.rewrite.IvmNormalizeMtmvTest
  - org.apache.doris.nereids.trees.plans.CreateMTMVCommandTest
  - org.apache.doris.catalog.ShowCreateMTMVTest
- Behavior changed: Yes (excluded AGG_KEYS row-id generation is now deterministic on agg keys)
- Does this need documentation: No
* fix comment
* fix comment
* opt code
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
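For readers following the refresh-method compatibility rules in the commits above, the four rules can be collapsed into a single predicate. The sketch below is illustrative only: `RefreshMethodSketch` and `RefreshMethodCompat` are hypothetical names, not the actual Doris classes, and the real validation may raise a different exception type with a different message.

```java
// Hypothetical stand-in for the refresh method enum; not the Doris type.
enum RefreshMethodSketch { COMPLETE, INCREMENTAL, AUTO }

// Hypothetical helper that encodes the four compatibility rules.
final class RefreshMethodCompat {
    private RefreshMethodCompat() {}

    /** Returns true when ALTER may change the refresh method from {@code from} to {@code to}. */
    static boolean isAllowed(RefreshMethodSketch from, RefreshMethodSketch to) {
        if (from == to) {
            return true; // same method: no-op, allowed
        }
        // The only permitted real change is COMPLETE/INCREMENTAL -> AUTO.
        // COMPLETE <-> INCREMENTAL and AUTO -> COMPLETE/INCREMENTAL both fall through to false.
        return to == RefreshMethodSketch.AUTO;
    }

    /** Throws when the change is forbidden; the message wording is an assumption. */
    static void validate(RefreshMethodSketch from, RefreshMethodSketch to) {
        if (!isAllowed(from, to)) {
            throw new IllegalArgumentException(
                    "cannot alter refresh method from " + from + " to " + to
                            + "; recreate the materialized view instead");
        }
    }
}
```

Because the enum has only three values, "target is AUTO" plus the no-op case covers every allowed transition, which keeps the check easy to audit against the rule list.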
…ages
### What problem does this PR solve?
Problem Summary:
1. Remove dead code in IvmNormalizeMtmv.buildRowId() — the isExcludedTriggerTable
branch at lines 503-505 was unreachable because all KeysType cases (UNIQUE_KEYS,
DUP_KEYS, AGG_KEYS) are already handled above it.
2. Improve the error message in AlterMTMVRefreshInfo.validateRefreshMethodCompat()
when attempting to alter the refresh method of an INCREMENTAL materialized view,
making it clearer that the operation is not allowed.
### Release note
None
### Check List (For Author)
- Test: Regression test / Unit Test
- FE UT: 94/94 IVM tests passed
- Regression: 10/10 IVM suites passed (mtmv_p0/ivm)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
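The buildRowId() commits in this PR rely on a row id that is derived only from key columns, so that it stays stable across refreshes. A minimal sketch of that idea follows. It is not the Doris implementation: `RowIdSketch`, the string-valued row map, and the choice of SHA-256 are all assumptions made for illustration, and key values are assumed to be non-null.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the real rule builds a plan expression rather than hashing in Java.
final class RowIdSketch {
    private RowIdSketch() {}

    /** Hashes only the key columns of a row, so value columns never affect the id. */
    static long rowIdFromKeys(Map<String, String> row, List<String> keyColumns) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String column : keyColumns) {
                byte[] value = row.get(column).getBytes(StandardCharsets.UTF_8);
                // Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
                md.update(ByteBuffer.allocate(4).putInt(value.length).array());
                md.update(value);
            }
            // Use the first 8 bytes of the digest as the numeric row id.
            return ByteBuffer.wrap(md.digest()).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Two rows that share key values but differ in a value column get the same id under this scheme, which is the property the focused tests in the commit message describe. A random id such as a UUID cannot offer that.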
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
Implements the core multi-bundle delta plan generation for IVM incremental refresh. This enables producing one delta command bundle per base table that has pending changes, with correct TSO snapshot binding across all scans in the normalized plan.
Changes:
- IvmStreamRef: Replaced streamType/consumerId/properties with consumedTso (persisted) and latestTso (transient). Added isUpToDate(). Deleted StreamType enum.
- OlapTable: Added getVisibleTso() mock (delegates to getVisibleVersion).
- LogicalOlapScan: Added tso (default -1) and isDelta (default false) fields, with withTso()/withIsDelta() methods. Both participate in equals().
- IvmDeltaRewriter: Complete rewrite with generateDeltaPlans() multi-bundle logic, rewriteOlapScans() helper, replaceWithDelta() mock. Uses rewriteDownShortCircuit + AtomicInteger for deterministic scan traversal. TSO binding: j<i → latestTso (v2), j>i → consumedTso (v1). Includes latestTso >= consumedTso invariant check.
- IvmSimpleScanDeltaStrategy: isDelta check: non-delta scans skip dml_factor.
- IvmAggDeltaStrategy: ctx made final, set via constructor (single-use).
- IvmRefreshManager: Empty bundles = success (no-op, all tables up to date).
- IvmDeltaRewriteContext: Added baseTableStreams field.
### Release note
None
### Check List (For Author)
- Test: Unit Test (109 IVM tests pass: 21 rewriter, 25 agg strategy, 14 simple strategy, 28 normalize, 7 util, 10 refresh manager, 4 executor)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
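The TSO binding rule in the commit above (j<i reads latestTso, j>i reads consumedTso) may be easier to follow as code. The sketch below is a simplified model, not the IvmDeltaRewriter itself. It assumes one scan per base table, scans visited in the same order as the streams list, `isUpToDate()` meaning `latestTso == consumedTso`, and a delta scan keeping the default tso of -1. All class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for IvmStreamRef.
final class StreamRefSketch {
    final long consumedTso;
    final long latestTso;

    StreamRefSketch(long consumedTso, long latestTso) {
        if (latestTso < consumedTso) {
            throw new IllegalArgumentException("latestTso must be >= consumedTso");
        }
        this.consumedTso = consumedTso;
        this.latestTso = latestTso;
    }

    boolean isUpToDate() {
        return latestTso == consumedTso; // assumed semantics
    }
}

// How one scan is bound inside one bundle.
final class ScanBindingSketch {
    final int scanIndex;
    final long tso;
    final boolean isDelta;

    ScanBindingSketch(int scanIndex, long tso, boolean isDelta) {
        this.scanIndex = scanIndex;
        this.tso = tso;
        this.isDelta = isDelta;
    }
}

final class DeltaBundleSketch {
    private DeltaBundleSketch() {}

    /** Produces one bundle per base table with pending changes. */
    static List<List<ScanBindingSketch>> generate(List<StreamRefSketch> streams) {
        List<List<ScanBindingSketch>> bundles = new ArrayList<>();
        for (int i = 0; i < streams.size(); i++) {
            if (streams.get(i).isUpToDate()) {
                continue; // no pending changes, no bundle
            }
            List<ScanBindingSketch> bundle = new ArrayList<>();
            for (int j = 0; j < streams.size(); j++) {
                StreamRefSketch s = streams.get(j);
                if (j == i) {
                    bundle.add(new ScanBindingSketch(j, -1L, true));            // delta scan
                } else if (j < i) {
                    bundle.add(new ScanBindingSketch(j, s.latestTso, false));   // new snapshot (v2)
                } else {
                    bundle.add(new ScanBindingSketch(j, s.consumedTso, false)); // old snapshot (v1)
                }
            }
            bundles.add(bundle);
        }
        return bundles;
    }
}
```

For a two-table join this reproduces the usual decomposition Δ(A ⋈ B) = (ΔA ⋈ B_old) ∪ (A_new ⋈ ΔB): the first bundle pairs the delta of A with the old snapshot of B, and the second pairs the new snapshot of A with the delta of B. If every stream is up to date, the result is an empty list, matching the "empty bundles = success" behavior noted for IvmRefreshManager.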
… latestTso reading
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Implements Steps 6-7 of the multi-bundle IVM plan:
Step 6 - runningIvmRefresh crash recovery flag:
- Add runningIvmRefresh boolean to IvmInfo with @SerializedName("rr")
- Add ALTER_IVM_INFO editlog op type with full AlterMTMV persistence path
- In IvmRefreshManager: set flag=true before bundle execution, clear after
success with consumedTso advance in one atomic editlog write
- On failure: leave flag set so next task detects and falls back to COMPLETE
- In MTMVTask: detect flag on COMPLETE refresh entry, capture pre-refresh
TSOs, reset state after successful full refresh
- MTMV.alterIvmInfo() and getIvmInfo() use writeMvLock for thread safety
Step 7 - latestTso reading and baseTableStreams passing:
- populateLatestTso() reads OlapTable.getVisibleTso() for each base table
- ensureBaseTableStreamsInitialized() lazily populates from MTMV relation
metadata on first incremental refresh (handles empty map from MTMV creation)
- Pass baseTableStreams to IvmDeltaRewriteContext for TSO binding
- Guard advanceConsumedTso: only advance if latestTso >= consumedTso to
prevent regression when table resolution fails
### Release note
None
### Check List (For Author)
- Test: Unit Test (71 tests pass: IvmRefreshManager 17, IvmDeltaRewriter 21, IvmAggDeltaStrategy 25, AlterMTMV 8) + Regression test (10/10 IVM tests pass)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
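The Step 6 crash-recovery flow described above can be sketched as a small in-memory state machine. This is only a model under stated assumptions: `IvmRecoverySketch` is a hypothetical class, editlog persistence and locking are omitted, and "reset state" after a full refresh is assumed to mean setting consumedTso to the TSOs captured before that refresh and clearing the flag.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the runningIvmRefresh flag; no persistence or locking.
final class IvmRecoverySketch {
    enum Outcome { INCREMENTAL_OK, NEEDS_COMPLETE, FAILED }

    boolean runningIvmRefresh;
    final Map<String, Long> consumedTso = new HashMap<>();

    Outcome incrementalRefresh(Map<String, Long> latestTso, Runnable executeBundles) {
        if (runningIvmRefresh) {
            // A previous incremental refresh did not finish: state is untrusted.
            return Outcome.NEEDS_COMPLETE;
        }
        runningIvmRefresh = true; // in the PR this is written to the editlog before execution
        try {
            executeBundles.run();
        } catch (RuntimeException e) {
            return Outcome.FAILED; // flag deliberately left set
        }
        // On success, advance consumedTso and clear the flag together.
        for (Map.Entry<String, Long> entry : latestTso.entrySet()) {
            Long current = consumedTso.get(entry.getKey());
            // Guard: never move consumedTso backwards.
            if (current == null || entry.getValue() >= current) {
                consumedTso.put(entry.getKey(), entry.getValue());
            }
        }
        runningIvmRefresh = false;
        return Outcome.INCREMENTAL_OK;
    }

    /** capturedTso holds the base table TSOs read before the full refresh started. */
    void completeRefresh(Map<String, Long> capturedTso, Runnable fullRefresh) {
        fullRefresh.run(); // an exception here leaves the flag set
        consumedTso.clear();
        consumedTso.putAll(capturedTso);
        runningIvmRefresh = false;
    }
}
```

Leaving the flag set on failure is what makes the design crash-safe: whether the process died or the bundles threw, the next task sees the same signal and takes the COMPLETE path.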
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
1. Added unit tests for IvmStreamRef (7 tests) and LogicalOlapScan tso/isDelta fields (5 tests)
2. Fixed populateLatestTso() to throw AnalysisException on failure instead of silently swallowing exceptions, preventing false no-op when TSO read fails on first refresh
3. Fixed captureBaseTableTsos() to return empty map on any failure, ensuring IVM state reset is safely skipped rather than partially applied
4. Added isEmpty() check in MTMVTask for captured TSOs
### Release note
None
### Check List (For Author)
- Test: Unit Test (86 tests pass across 6 IVM test files)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
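The two failure policies in the commit above differ on purpose: the incremental path must fail loudly, while the state-reset path must be all-or-nothing. A rough sketch of that contrast follows. `TsoReadSketch` and the `reader` callback are hypothetical, and `IllegalStateException` stands in for Doris's AnalysisException, which is not reproduced here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical helper; reader stands in for looking up a table and reading its visible TSO.
final class TsoReadSketch {
    private TsoReadSketch() {}

    /** Strict: any read failure aborts, so a failed read is never mistaken for "no changes". */
    static Map<String, Long> populateLatestTso(List<String> tables, ToLongFunction<String> reader) {
        Map<String, Long> result = new HashMap<>();
        for (String table : tables) {
            try {
                result.put(table, reader.applyAsLong(table));
            } catch (RuntimeException e) {
                throw new IllegalStateException("failed to read TSO for table " + table, e);
            }
        }
        return result;
    }

    /** Lenient: returns an empty map on any failure, never a partial one. */
    static Map<String, Long> captureBaseTableTsos(List<String> tables, ToLongFunction<String> reader) {
        try {
            return populateLatestTso(tables, reader);
        } catch (RuntimeException e) {
            return Collections.emptyMap();
        }
    }
}
```

A caller on the reset path would then check `isEmpty()` on the captured map and skip the IVM state reset when nothing was captured, rather than applying TSOs for only some of the base tables.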
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
Author
run buildall
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)