
feat(databricks): add Databricks Model Serving + AI Gateway provider #26510

Open
dgokeeffe wants to merge 2 commits into anomalyco:dev from dgokeeffe:feat/databricks-provider

Conversation

@dgokeeffe

Issue for this PR

Closes #7983

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Adds databricks as a custom provider with auto-discovery across both surfaces a Databricks workspace can expose:

  • Model Serving (/serving-endpoints)
  • AI Gateway v2 (/ai-gateway)

At init the provider probes /ai-gateway/anthropic/v1/models. If that endpoint is reachable, it routes Claude / Gemini / GPT through the official AI-SDK adapters at AI Gateway URLs. If the probe 404s, it falls back to /serving-endpoints and routes GPT through @databricks/ai-sdk-provider's bundled Responses adapter. The surface can be forced via provider.databricks.options.surface = "auto" | "ai-gateway" | "model-serving".
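For example, forcing the Model Serving surface looks roughly like this in the opencode config (a minimal sketch: the provider/options shape follows the description above; everything else about the file is assumed):

```jsonc
// opencode config sketch; only the surface override is load-bearing here
{
  "provider": {
    "databricks": {
      "options": {
        // "auto" (default) probes AI Gateway first; the other two force a surface
        "surface": "model-serving"
      }
    }
  }
}
```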

Auth uses @databricks/sdk-experimental's standard chain (profile / OAuth / PAT / env). Token refresh happens per request; there is no background timer.
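In sketch form, per-request refresh is a fetch wrapper that asks the SDK for fresh headers every time. dbConfig.authenticate(headers) is the call named in the commit message; the wrapper itself is illustrative, and the real databricksFetch additionally patches tool schemas (quirk 1 below):

```ts
// Illustrative sketch of per-request auth, not the exact implementation.
// authenticate() resolves the standard chain (profile / OAuth / PAT / env)
// and refreshes OAuth tokens as needed, so no background timer is required.
declare const dbConfig: { authenticate(headers: Headers): Promise<void> }

async function databricksFetch(input: RequestInfo | URL, init: RequestInit = {}): Promise<Response> {
  const headers = new Headers(init.headers)
  await dbConfig.authenticate(headers) // fresh credentials on every call
  return fetch(input, { ...init, headers })
}
```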

Three Databricks-side quirks are patched client-side and gated so they go inert when not needed (the first two are sketched after the list):

  1. Tool-schema validator rejects JSON Schemas without an explicit type: "object" wrapper. databricksFetch patches outgoing schemas before send. No-op when the SDK already wraps them.
  2. Oversized item IDs on the Responses path. Databricks emits IDs up to ~192 chars; OpenAI's Responses backend caps at 64. The SSE patcher truncates id / item_id / call_id deterministically and strips oversized providerOptions.{databricks,openai}.itemId on outgoing assistant messages. GPT only.
  3. Mismatched item IDs across response.output_item.added and response.content_part.added / output_text.delta for the same output_index on AI Gateway. Patcher tracks the canonical id per output_index and rewrites mismatched item_ids on dependent events. GPT/Responses only.
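The sketches below illustrate quirks 1 and 2 under stated assumptions: the real patches operate on outgoing request bodies and parsed SSE events respectively, and the PR text does not spell out the truncation scheme, so the hash-suffix rule shown here is one deterministic possibility rather than the shipped one.

```ts
// Quirk 1 sketch: ensure an explicit type: "object" on tool parameter schemas
// before the request is sent. No-op when the SDK already set it. Assumes the
// schema is a plain JSON Schema object; the shipped patch lives in databricksFetch.
function ensureObjectType(schema: Record<string, unknown>): Record<string, unknown> {
  return schema.type === "object" ? schema : { ...schema, type: "object" }
}
```

```ts
import { createHash } from "node:crypto"

const MAX_ID_LEN = 64 // OpenAI Responses backend cap

// Quirk 2 sketch: map oversized Databricks IDs (~192 chars) onto stable
// 64-char IDs. Hash-suffixing keeps the mapping deterministic, so the same
// long id / item_id / call_id always shortens to the same value.
function truncateId(id: string): string {
  if (id.length <= MAX_ID_LEN) return id
  const suffix = createHash("sha256").update(id).digest("hex").slice(0, 16)
  return id.slice(0, MAX_ID_LEN - 17) + "_" + suffix
}
```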

session/llm.ts synthesises a tool-input-start chunk before each tool-call and dedupes the bundled provider's flush() re-emit by toolCallId. The middleware is gated on npm === "@databricks/ai-sdk-provider" so it stays inert when GPT goes through @ai-sdk/openai on the AI Gateway path, and it is marked DELETE-WHEN with the exact upstream conditions for retirement.
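Conceptually the middleware is a small stream transform along these lines (chunk shapes are simplified; the real code is wired into session/llm.ts and only runs under the gate above):

```ts
// Sketch only: synthesize tool-input-start ahead of each tool-call and drop
// the duplicate tool-call the bundled provider re-emits from flush().
interface StreamChunk {
  type: string
  toolCallId?: string
  [key: string]: unknown
}

function databricksToolStreamPatch() {
  const seen = new Set<string>()
  return new TransformStream<StreamChunk, StreamChunk>({
    transform(chunk, controller) {
      if (chunk.type === "tool-call" && chunk.toolCallId) {
        if (seen.has(chunk.toolCallId)) return // flush() re-emit; drop it
        seen.add(chunk.toolCallId)
        controller.enqueue({ type: "tool-input-start", id: chunk.toolCallId })
      }
      controller.enqueue(chunk)
    },
  })
}
```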

bun.lock is intentionally omitted from this PR; running bun install after merge picks up @databricks/ai-sdk-provider@0.5.0 and @databricks/sdk-experimental@0.16.0 from the catalog.

How did you verify your code works?

End-to-end via test-databricks-3-classes.sh (full opencode agent loop) against both surfaces with Claude Sonnet 4.6, GPT-5.5, and Gemini 2.5 Pro: 3/3 pass on both.

To reproduce:

  1. bun install
  2. Configure a Databricks profile with serving-endpoints access
  3. Run ./test-databricks-3-classes.sh; it should pass 3/3
  4. Repeat against a workspace with AI Gateway v2 enabled; the provider auto-detects the gateway URLs

An SDK-level baseline at packages/opencode/script/test-databricks-3-classes.ts exercises @ai-sdk/openai directly against /serving-endpoints, which is useful for isolating opencode-stack issues from raw-API issues.
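In outline, the baseline does roughly the following (host, token handling, and endpoint name here are placeholders, not values from the script):

```ts
// Sketch of the SDK-level baseline: @ai-sdk/openai aimed straight at
// /serving-endpoints, bypassing the opencode provider stack entirely.
import { createOpenAI } from "@ai-sdk/openai"
import { generateText } from "ai"

const openai = createOpenAI({
  baseURL: "https://<workspace-host>/serving-endpoints", // Model Serving's OpenAI-compatible surface
  apiKey: process.env.DATABRICKS_TOKEN, // PAT for the sketch; the provider proper uses the SDK auth chain
})

const { text } = await generateText({
  model: openai("<serving-endpoint-name>"), // placeholder endpoint name
  prompt: "Reply with a single word.",
})
console.log(text)
```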

Screenshots / recordings

N/A — no UI changes.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

David O'Keeffe added 2 commits May 9, 2026 16:55
Re-synced onto current upstream/dev. Brings the Databricks integration as
a single, cleanly-organised commit on top of upstream:

- packages/opencode/src/auth/index.ts: add DatabricksProfile auth class
  (databricks-profile type) to the Auth.Info union for profile-based
  Databricks SDK auth.

- packages/opencode/src/provider/provider.ts: register `databricks` as a
  custom provider. Auto-discovers serving-endpoints, classifies model
  family (Claude / GPT / Gemini / Llama / Qwen / Gemma / Codex), routes
  per-family through @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/openai or
  the bundled @databricks/ai-sdk-provider as appropriate. Probes
  /ai-gateway/anthropic/v1/models at startup; if 200, uses AI Gateway
  URLs (anthropic/v1, gemini/v1beta, codex/v1); else falls back to
  /serving-endpoints. Override via provider.databricks.options.surface =
  "auto" | "ai-gateway" | "model-serving" (default "auto"). Per-request
  Databricks SDK auth via dbConfig.authenticate(headers) handles OAuth
  token refresh transparently — no background thread needed.

  SSE patcher for the OpenAI Responses path (used by GPT on either
  surface) addresses two server-side quirks: (a) item IDs up to ~192
  chars (OpenAI Responses backend caps at 64) — truncate id/item_id/
  call_id deterministically; (b) AI Gateway emits
  response.output_item.added with one item id and the subsequent
  response.content_part.added/output_text.delta with a different item_id
  for the same output_index — track the canonical id per output_index
  and rewrite mismatched item_ids on dependent events. Also tool-schema
  type:object wrapper patching for the proxy's strict validator.

  Optional outgoing-body / incoming-SSE workaround disable via
  DATABRICKS_BARE_FETCH=1 env var; useful for verifying which patches
  remain load-bearing on a given surface.

  AI Gateway path drops the per-endpoint maxTools cap on gpt/codex
  family (89 tools verified accepted server-side); model-serving keeps
  the cap to avoid the 89-tool rejection.

- packages/opencode/src/provider/transform.ts: strip oversized itemIds
  from outgoing assistant messages on Responses paths. Handles both
  providerOptions.databricks.itemId (model-serving via bundled provider)
  and providerOptions.openai.itemId (AI Gateway via @ai-sdk/openai).

- packages/opencode/src/session/llm.ts: middleware that synthesizes a
  tool-input-start chunk before each tool-call and dedupes the bundled
  provider's flush() re-emit by toolCallId. Gated on
  npm === "@databricks/ai-sdk-provider" so it stays inert when GPT goes
  through @ai-sdk/openai on the AI Gateway path.

  DELETE-WHEN: drop this middleware once @databricks/ai-sdk-provider
  ships a release that (a) emits the AI-SDK v3 tool-streaming lifecycle
  on its Responses path and (b) stops setting providerExecuted: true in
  flush() when useRemoteToolCalling is false. Currently broken in 0.5.0.

- package.json (catalog): @databricks/ai-sdk-provider 0.5.0 and
  @databricks/sdk-experimental 0.16.0. @opentui/{core,solid} pinned to
  0.2.0 (latest available on the Databricks internal npm proxy).

- .gitignore: keep local-only notes/reproductions/handover docs out of
  the public fork.

Verified: test-databricks-3-classes.sh passes 3/3 against logfood
(model-serving) and aigw (ai-gateway) workspaces — Claude, GPT-5.5,
Gemini all complete tool-use roundtrips on both surfaces.

Co-authored-by: Isaac

- test-databricks-3-classes.sh: drives the full opencode HTTP API end-to-end
  against three model families (Claude / GPT / Gemini) on whatever Databricks
  surface is configured. Tests basic response and tool-call execution.

- packages/opencode/script/test-databricks-3-classes.ts: SDK-level baseline
  that uses @ai-sdk/openai directly against /serving-endpoints. Useful for
  isolating opencode-stack vs raw-API issues during debugging — but note that
  on AI Gateway it bypasses @databricks/ai-sdk-provider's chunk emission, so
  a passing run here does NOT validate the bundled provider's path.

- .gitignore: ignore models-snapshot.ts (auto-generated by build.ts).

Co-authored-by: Isaac


Development

Successfully merging this pull request may close these issues.

Support for Databricks Foundation Model APIs provider
