Skip to content

add nlq to OpenMetadataApplicationConfig#27988

Open
lautel wants to merge 5 commits intomainfrom
add-nlq-config-settings
Open

add nlq to OpenMetadataApplicationConfig#27988
lautel wants to merge 5 commits intomainfrom
add-nlq-config-settings

Conversation

@lautel
Copy link
Copy Markdown
Contributor

@lautel lautel commented May 8, 2026


Summary by Gitar

  • Configuration updates:
    • Renamed maxConcurrentEmbeddingRequests to maxConcurrentRequests in openmetadata.yaml and service logic.
  • Schema enhancements:
    • Added model tuning parameters (timeoutSeconds, maxTokens, temperature) for bedrock and openai configurations.
    • Added filterExtractor for NLQ cache tuning and hybridSearch settings for OpenSearch scoring and pagination parameters.
  • Client updates:
    • Updated OpenAIEmbeddingClient to use maxConcurrentRequests from NaturalLanguageSearchConfiguration during initialization.
  • Application core:
    • Added nlqHybridSearch field to OpenMetadataApplicationConfig to support new NLQ configurations.
  • Generated types:
    • Updated elasticSearchConfiguration.ts and settings.ts in the UI to reflect new NLQ and configuration schema changes.

This will update automatically on new commits.

@lautel lautel self-assigned this May 8, 2026
@lautel lautel added the safe to test Add this label to run secure Github workflows on PRs label May 8, 2026
dimension,
endpoint,
isAzure,
new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Instantiating config object just to read a default value

The change replaces DEFAULT_MAX_CONCURRENT_REQUESTS (a clear constant with value 10) with new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests(). This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in EmbeddingClient. The parent class EmbeddingClient already has resolveMaxConcurrent() which falls back to DEFAULT_MAX_CONCURRENT_REQUESTS when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

Suggested fix:

Replace with the existing constant:
    this(
        httpClient,
        apiKey,
        modelId,
        dimension,
        endpoint,
        isAzure,
        DEFAULT_MAX_CONCURRENT_REQUESTS);

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions Bot requested a review from a team as a code owner May 8, 2026 13:53
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 8, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Integrates NLQ configuration into the application core and updates client settings for Bedrock and OpenAI providers. Refactor the instantiation of NaturalLanguageSearchConfiguration to avoid object creation when retrieving default values.

💡 Quality: Instantiating config object just to read a default value

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/vector/client/OpenAIEmbeddingClient.java:88

The change replaces DEFAULT_MAX_CONCURRENT_REQUESTS (a clear constant with value 10) with new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests(). This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in EmbeddingClient. The parent class EmbeddingClient already has resolveMaxConcurrent() which falls back to DEFAULT_MAX_CONCURRENT_REQUESTS when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

Suggested fix
Replace with the existing constant:
    this(
        httpClient,
        apiKey,
        modelId,
        dimension,
        endpoint,
        isAzure,
        DEFAULT_MAX_CONCURRENT_REQUESTS);
✅ 1 resolved
Quality: Untyped JsonNode config loses validation and discoverability

📄 openmetadata-service/src/main/java/org/openmetadata/service/OpenMetadataApplicationConfig.java:83-84
The nlqHybridSearch field uses raw JsonNode, which bypasses Dropwizard's @Valid/@NotNull constraint propagation and provides no compile-time contract for consumers. If the structure is known (even partially), a dedicated configuration record/class would enable validation at startup, self-document expected keys, and align with the typed pattern used by every other field in this class (e.g., ElasticSearchConfiguration, AuthenticationConfiguration).

🤖 Prompt for agents
Code Review: Integrates NLQ configuration into the application core and updates client settings for Bedrock and OpenAI providers. Refactor the instantiation of NaturalLanguageSearchConfiguration to avoid object creation when retrieving default values.

1. 💡 Quality: Instantiating config object just to read a default value
   Files: openmetadata-service/src/main/java/org/openmetadata/service/search/vector/client/OpenAIEmbeddingClient.java:88

   The change replaces `DEFAULT_MAX_CONCURRENT_REQUESTS` (a clear constant with value 10) with `new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests()`. This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in `EmbeddingClient`. The parent class `EmbeddingClient` already has `resolveMaxConcurrent()` which falls back to `DEFAULT_MAX_CONCURRENT_REQUESTS` when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

   Suggested fix:
   Replace with the existing constant:
       this(
           httpClient,
           apiKey,
           modelId,
           dimension,
           endpoint,
           isAzure,
           DEFAULT_MAX_CONCURRENT_REQUESTS);

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 8, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

🟡 Playwright Results — all passed (15 flaky)

✅ 4013 passed · ❌ 0 failed · 🟡 15 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 298 0 1 4
🟡 Shard 2 750 0 5 8
🟡 Shard 3 755 0 4 7
🟡 Shard 4 788 0 2 18
✅ Shard 5 687 0 0 41
🟡 Shard 6 735 0 3 8
🟡 15 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Glossary/GlossaryHierarchy.spec.ts › should cancel move operation (shard 2, 1 retry)
  • Features/Glossary/GlossaryP3Tests.spec.ts › should handle multiple rapid API calls (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2, 2 retries)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Flow/IngestionBot.spec.ts › Ingestion bot should be able to access domain specific domain (shard 3, 1 retry)
  • Flow/NotificationAlerts.spec.ts › Conversation source alert (shard 3, 1 retry)
  • Flow/ObservabilityAlerts.spec.ts › Alert operations for a user with and without permissions (shard 3, 2 retries)
  • Pages/CustomProperties.spec.ts › Should verify property name is visible for apiCollection in right panel (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Table (shard 4, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Pipeline (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant