Skip to content

Fix: [Redshift] interval and numeric column type parsing#27983

Open
mohittilala wants to merge 2 commits intomainfrom
fix/redshift-interval-precision-parsing
Open

Fix: [Redshift] interval and numeric column type parsing#27983
mohittilala wants to merge 2 commits intomainfrom
fix/redshift-interval-precision-parsing

Conversation

@mohittilala
Copy link
Copy Markdown
Contributor

Describe your changes:

Follow-up to #27524 (which Fixes #24106). Two more parsing failures in the same _get_args_and_kwargs function in ingestion/src/metadata/ingestion/source/database/redshift/utils.py, both surfaced when ingesting federated PostgreSQL schemas through Redshift:

  1. interval(N) columns raised INTERVAL.__init__() got multiple values for argument 'precision'. _init_args was extracting the precision into positional args while kwargs["precision"] was also being set on the same call. Fix: reset args = () in the interval branch so precision and fields flow through kwargs only. Same pattern as Fix Redshift timestamp precision parsing #27524 applies to time/timestamp.
  2. numeric(N) columns (precision-only, no scale) crashed with ValueError: not enough values to unpack (expected 2, got 1) because prec, scale = charlen.split(",") assumed a comma-separated charlen. Fix: parse charlen.split(",") into a tuple of one or two ints, reset args = () when no charlen is reported.

Both fixes mirror upstream SQLAlchemy's PostgreSQL dialect.

Tests in test_redshift_utils.py exercise the real _get_args_and_kwargs + _update_coltype path end-to-end (no over-mocking) for interval(6), interval day to second(6), bare interval, numeric(10), numeric(10,2) regression, and bare numeric.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Bug fix

  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

@mohittilala mohittilala self-assigned this May 8, 2026
Copilot AI review requested due to automatic review settings May 8, 2026 08:15
@mohittilala mohittilala requested a review from a team as a code owner May 8, 2026 08:15
@mohittilala mohittilala added bug Something isn't working Ingestion safe to test Add this label to run secure Github workflows on PRs To release Will cherry-pick this PR into the release branch python Pull requests that update python code labels May 8, 2026
@mohittilala mohittilala moved this to In Progress 🏗️ in Shipping May 8, 2026
Comment thread ingestion/tests/unit/topology/database/test_redshift_utils.py
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Redshift ingestion column-type parser to better handle PostgreSQL-compatible type modifiers coming from federated schemas, specifically for interval and numeric types, and adds unit tests around these parsing paths.

Changes:

  • Adjust numeric argument parsing to support precision-only forms like numeric(10) without crashing.
  • Ensure interval precision/fields are routed via keyword arguments (not positional args) to avoid constructor-argument collisions.
  • Add unit tests covering interval(...) and numeric(...) parsing and SQLAlchemy type construction.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
ingestion/src/metadata/ingestion/source/database/redshift/utils.py Updates _get_args_and_kwargs to handle numeric precision-only and to clear positional args for interval types.
ingestion/tests/unit/topology/database/test_redshift_utils.py Adds end-to-end unit tests for _get_args_and_kwargs + _update_coltype for interval and numeric.

Comment thread ingestion/src/metadata/ingestion/source/database/redshift/utils.py
Comment thread ingestion/tests/unit/topology/database/test_redshift_utils.py
Comment thread ingestion/tests/unit/topology/database/test_redshift_utils.py
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@mohittilala mohittilala changed the title Fixes #24106: Redshift interval and numeric column type parsing Fix: [Redshift] interval and numeric column type parsing May 8, 2026
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 8, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Redshift ingestion now correctly handles interval and numeric column types by normalizing argument parsing and splitting logic. Extensive test coverage confirms support for various precision and scale configurations.

✅ 1 resolved
Quality: New tests use unittest.TestCase instead of plain pytest+assert

📄 ingestion/tests/unit/topology/database/test_redshift_utils.py:300-314
The custom review instructions specify: "Use pytest with assert (no unittest.TestCase). Use unittest.mock for external boundaries only." The new TestRedshiftIntervalParsing and TestRedshiftNumericParsing classes use unittest.TestCase with self.assertEqual/self.assertIsNone. While this is consistent with the existing tests in the same file, it diverges from the stated project convention. Consider using plain assert statements and standalone test functions (or at minimum plain assert inside the class) to align with the project's testing guidelines.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 8, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

🔴 Playwright Results — 8 failure(s), 21 flaky

✅ 3999 passed · ❌ 8 failed · 🟡 21 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
🔴 Shard 1 296 2 1 4
🟡 Shard 2 753 0 2 8
🔴 Shard 3 749 4 6 7
🟡 Shard 4 787 0 3 18
🔴 Shard 5 682 2 3 41
🟡 Shard 6 732 0 6 8

Genuine Failures (failed on all attempts)

Features/CustomizeDetailPage.spec.ts › Dashboard - customization should work (shard 1)
�[31mTest timeout of 60000ms exceeded while setting up "userPage".�[39m
Features/CustomizeDetailPage.spec.ts › Domain - customize tab label should only render if it's customized by user (shard 1)
�[31m"beforeAll" hook timeout of 60000ms exceeded.�[39m
Features/Permission.spec.ts › Permissions (shard 3)
�[31m"beforeAll" hook timeout of 60000ms exceeded.�[39m
Features/Tasks/TaskNavigation.spec.ts › clicking task notification while on entity task tab refreshes the task list (shard 3)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoBeVisible�[2m(�[22m�[2m)�[22m failed

Locator: locator('.notification-box').locator('li.ant-list-item.notification-dropdown-list-btn').first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
�[2m  - Expect "toBeVisible" with timeout 15000ms�[22m
�[2m  - waiting for locator('.notification-box').locator('li.ant-list-item.notification-dropdown-list-btn').first()�[22m

Features/Tasks/TaskNavigation.spec.ts › two sessions: admin on Columns tab creates task, assignee sees refresh on notification click (shard 3)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoBeVisible�[2m(�[22m�[2m)�[22m failed

Locator: locator('.notification-box').locator('li.ant-list-item.notification-dropdown-list-btn').first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
�[2m  - Expect "toBeVisible" with timeout 15000ms�[22m
�[2m  - waiting for locator('.notification-box').locator('li.ant-list-item.notification-dropdown-list-btn').first()�[22m

Flow/ObservabilityAlerts.spec.ts › Alert operations for a user with and without permissions (shard 3)
�[31mTest timeout of 60000ms exceeded.�[39m
Pages/Entity.spec.ts › DashboardDataModel page should show the project name (shard 5)
�[31m"beforeAll" hook timeout of 60000ms exceeded.�[39m
Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 5)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoContainText�[2m(�[22m�[32mexpected�[39m�[2m)�[22m failed

Locator: getByTestId('Tier')
Expected substring: �[32m"Tier�[7m3�[27m"�[39m
Received string:    �[31m"Tier�[7m1�[27m"�[39m
Timeout: 15000ms

Call log:
�[2m  - Expect "toContainText" with timeout 15000ms�[22m
�[2m  - waiting for getByTestId('Tier')�[22m
�[2m    18 × locator resolved to <span data-testid="Tier" class="ant-tag tag-chip tag-chip-content cursor-pointer">…</span>�[22m
�[2m       - unexpected value "Tier1"�[22m

🟡 21 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Stored Procedure - customization should work (shard 1, 2 retries)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 2 retries)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Permissions/DomainPermissions.spec.ts › Domain allow operations (shard 3, 2 retries)
  • Features/Permissions/DomainPermissions.spec.ts › Domain deny operations (shard 3, 1 retry)
  • Features/QueryEntity.spec.ts › Verify Query Pagination (shard 3, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Table pagination with sorting should works (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Should search custom properties for topic in right panel (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Should clear search and show all properties for pipeline in right panel (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Contract Status badge should be visible on condition if Contract Tab is present/hidden by Persona (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 5, 2 retries)
  • Pages/Entity.spec.ts › Domain Propagation (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should verify deleted user not visible in owner selection for pipeline (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should allow Data Consumer to edit glossary terms for dashboardDataModel (shard 6, 2 retries)
  • Pages/ExplorePageRightPanel.spec.ts › Should NOT allow Data Consumer to edit owners when entity has owner (shard 6, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should clear description for dashboard (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Teams.spec.ts › Add and Remove User for Team (shard 6, 1 retry)
  • Pages/Users.spec.ts › Create and Delete user (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Ingestion python Pull requests that update python code safe to test Add this label to run secure Github workflows on PRs To release Will cherry-pick this PR into the release branch

Projects

Status: In Progress 🏗️

Development

Successfully merging this pull request may close these issues.

Redshift ingestion fails with "TIMESTAMP.__init__() got multiple values for argument 'timezone'" on timestamp(0) without time zone columns

2 participants