feat(providers): add DeepInfra as a built-in inference provider #1773

Closed
mmilutinovic371 wants to merge 1 commit into NVIDIA:main from mmilutinovic371:feat/deepinfra-provider

Conversation

mmilutinovic371 commented Jun 5, 2026

Summary

DeepInfra hosts open-source LLMs behind an OpenAI-compatible API, and its low cost and high performance make it a good fit for agent frameworks. This PR promotes it from a documented workaround to a core built-in provider in OpenShell.

  • Adds deepinfra as a built-in inference provider alongside nvidia, openai, and anthropic
  • DEEPINFRA_API_KEY is now discovered automatically via --from-existing
  • openshell provider list-profiles shows DeepInfra in the INFERENCE section
  • Fixes build_backend_url so that it strips the leading /v1 from the request path when the provider base URL contains /v1/ as an internal path segment (e.g. https://api.deepinfra.com/v1/openai). Without this fix, requests were routed to .../v1/openai/v1/chat/completions (404) instead of .../v1/openai/chat/completions.
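
The path-joining rule behind the build_backend_url fix in the last bullet above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from crates/openshell-router/src/backend.rs: the function name join_backend_url, the base_path helper, and the whole-segment matching rule are hypothetical, and the example.com URLs are placeholders.

```rust
/// Hypothetical helper: returns the path component of a base URL
/// ("" when the URL has no path), without pulling in a URL crate.
fn base_path(base: &str) -> &str {
    // Skip past "scheme://" if present so the host is not mistaken for a path.
    let after_scheme = base.find("://").map_or(base, |i| &base[i + 3..]);
    after_scheme.find('/').map_or("", |i| &after_scheme[i..])
}

/// Hypothetical stand-in for the router's URL builder (an assumption, not the PR's code).
/// Joins a provider base URL with an incoming request path, dropping the
/// request's leading "/v1" when the base path already carries a "v1" segment.
fn join_backend_url(base_url: &str, request_path: &str) -> String {
    let base = base_url.trim_end_matches('/');

    // True for ".../v1" and for ".../v1/openai" alike.
    let base_has_v1 = base_path(base).split('/').any(|segment| segment == "v1");

    let path = if base_has_v1 {
        match request_path.strip_prefix("/v1") {
            // Strip only when "/v1" is a whole segment, so "/v1beta/..." is left alone.
            Some(rest) if rest.is_empty() || rest.starts_with('/') => rest,
            _ => request_path,
        }
    } else {
        request_path
    };

    format!("{base}{path}")
}
```

Under these assumptions, join_backend_url("https://api.deepinfra.com/v1/openai", "/v1/chat/completions") yields https://api.deepinfra.com/v1/openai/chat/completions, while a base URL with no v1 segment keeps the request path unchanged.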

Related Issue

N/A

Changes

  • providers/deepinfra.yaml: new built-in profile (inference category, api.deepinfra.com:443, Bearer auth, DEEPINFRA_API_KEY)
  • crates/openshell-core/src/inference.rs: DEEPINFRA_PROFILE, normalization, profile_for entries + tests
  • crates/openshell-providers/src/providers/deepinfra.rs: discovery spec + env-var test
  • crates/openshell-providers/src/{lib,profiles,providers/mod}.rs: registration (alphabetical module order)
  • crates/openshell-router/src/backend.rs: URL construction fix + test
  • docs/sandboxes/providers-v2.mdx, docs/sandboxes/manage-providers.mdx: DeepInfra rows

Testing

  • mise run pre-commit passes (rust, helm, markdown, license; python:proto is a pre-existing failure unrelated to this PR)
  • 262 Rust unit tests pass across openshell-core, openshell-providers, openshell-router (cargo test -p openshell-core -p openshell-providers -p openshell-router)
  • openshell provider list-profiles shows deepinfra in INFERENCE section
  • openshell provider create --name di --type deepinfra --from-existing discovers DEEPINFRA_API_KEY
  • openshell inference set --provider di --model <model> --no-verify configures route
  • curl https://inference.local/v1/chat/completions from inside sandbox returns a valid completion from DeepInfra
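
The --from-existing discovery step exercised in the list above can be sketched as follows. The two struct fields and the SPEC constant mirror the deepinfra.rs snippet quoted in the review on this PR; the discover_credential helper, its signature, and the rule that blank values are ignored are assumptions made for illustration, not OpenShell's implementation.

```rust
/// Mirrors the two fields visible in this PR's `deepinfra.rs`;
/// the real struct may carry more fields.
pub struct ProviderDiscoverySpec {
    pub id: &'static str,
    pub credential_env_vars: &'static [&'static str],
}

pub const SPEC: ProviderDiscoverySpec = ProviderDiscoverySpec {
    id: "deepinfra",
    credential_env_vars: &["DEEPINFRA_API_KEY"],
};

/// Hypothetical helper (not part of the PR): returns the first credential
/// variable that is set to a non-blank value, together with that value.
/// `lookup` abstracts the environment so the function can be tested
/// without mutating process-wide state.
pub fn discover_credential<F>(
    spec: &ProviderDiscoverySpec,
    lookup: F,
) -> Option<(&'static str, String)>
where
    F: Fn(&str) -> Option<String>,
{
    spec.credential_env_vars.iter().find_map(|&name| {
        lookup(name)
            // Treat empty or whitespace-only values as "not set" (an assumption).
            .filter(|value| !value.trim().is_empty())
            .map(|value| (name, value))
    })
}
```

Against the real process environment, the helper would be called as discover_credential(&SPEC, |key| std::env::var(key).ok()). Passing the lookup as a closure keeps tests free of global environment mutation.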

Unit test results

test result: ok. 164 passed; 0 failed; 0 ignored  (openshell-core)
test result: ok. 37 passed;  0 failed; 0 ignored  (openshell-providers)
test result: ok. 44 passed;  0 failed; 0 ignored  (openshell-router)
test result: ok. 17 passed;  0 failed; 0 ignored  (openshell-router integration)

Includes inference::tests::profile_for_deepinfra, providers::deepinfra::tests::discovers_deepinfra_env_credentials, and backend::tests::build_backend_url_dedupes_v1_for_base_with_v1_subpath.

Provider list-profiles

INFERENCE
  deepinfra         DeepInfra         endpoints: 1  inference
  google-vertex-ai  Google Vertex AI  endpoints: 4  inference
  nvidia            NVIDIA            endpoints: 1  inference

End-to-end inference from inside sandbox

$ curl -s https://inference.local/v1/chat/completions --insecure \
    -H "Content-Type: application/json" \
    -d '{"model":"Qwen/Qwen3-30B-A3B","messages":[{"role":"user","content":"Say hello"}],"max_tokens":50}'

{"id":"chatcmpl-RvC46ezaN8prTxquYHZZJLMX","object":"chat.completion","model":"Qwen/Qwen3-30B-A3B",
 "choices":[{"message":{"role":"assistant","content":"<think>\nOkay, the user said \"Say hello.\" ..."}}],
 "usage":{"prompt_tokens":10,"total_tokens":60,"completion_tokens":50,"estimated_cost":2.34e-05}}

Screenshots / Logs

[Seven screenshots, captured 2026-06-05 between 15:55:32 and 15:58:53; images not included.]

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (docs/sandboxes/providers-v2.mdx, docs/sandboxes/manage-providers.mdx)

- Add providers/deepinfra.yaml profile (category: inference, endpoint:
  api.deepinfra.com:443, credential: DEEPINFRA_API_KEY)
- Register profile in BUILT_IN_PROFILE_YAMLS
- Add ProviderDiscoverySpec for DEEPINFRA_API_KEY env-var discovery
- Add DEEPINFRA_PROFILE to openshell-core inference profiles
  (base URL: https://api.deepinfra.com/v1/openai, Bearer auth,
  OpenAI-compatible protocols)
- Fix build_backend_url to strip /v1 prefix from request path when the
  base URL contains /v1/ as an internal segment, not just when it ends
  with /v1; this prevents URL doubling for providers like DeepInfra
  whose base URL is already rooted under /v1/openai
- Update providers-v2 and manage-providers docs with DeepInfra rows
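
A rough picture of what the DEEPINFRA_PROFILE, normalization, and profile_for entries in the commit message might look like is given below. Only the base URL, the Bearer auth scheme, and the DEEPINFRA_API_KEY name come from this PR; the InferenceProfile and AuthScheme types, their field names, and the normalize_provider_type function are hypothetical and do not reflect the actual definitions in openshell-core.

```rust
/// Hypothetical auth scheme enum; only Bearer is mentioned in this PR.
#[derive(Debug, PartialEq)]
pub enum AuthScheme {
    Bearer,
}

/// Hypothetical profile shape (field names are assumptions).
#[derive(Debug, PartialEq)]
pub struct InferenceProfile {
    pub provider_type: &'static str,
    pub default_base_url: &'static str,
    pub credential_env_var: &'static str,
    pub auth: AuthScheme,
}

/// Values taken from the PR description and commit message.
pub static DEEPINFRA_PROFILE: InferenceProfile = InferenceProfile {
    provider_type: "deepinfra",
    default_base_url: "https://api.deepinfra.com/v1/openai",
    credential_env_var: "DEEPINFRA_API_KEY",
    auth: AuthScheme::Bearer,
};

/// Hypothetical normalization: trims whitespace and ignores ASCII case.
pub fn normalize_provider_type(input: &str) -> Option<&'static str> {
    match input.trim().to_ascii_lowercase().as_str() {
        "deepinfra" => Some("deepinfra"),
        _ => None,
    }
}

/// Looks up the built-in profile for a user-supplied provider type.
pub fn profile_for(input: &str) -> Option<&'static InferenceProfile> {
    match normalize_provider_type(input)? {
        "deepinfra" => Some(&DEEPINFRA_PROFILE),
        _ => None,
    }
}
```

In this sketch, profile_for(" DeepInfra ") resolves to DEEPINFRA_PROFILE and an unrecognized type returns None, so callers can fall back to an error or a generic profile.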

copy-pr-bot Bot commented Jun 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions Bot commented Jun 5, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

github-actions Bot commented Jun 5, 2026

Thank you for your interest in contributing to OpenShell, @mmilutinovic371.

This project uses a vouch system for first-time contributors. Before submitting a pull request, you need to be vouched by a maintainer.

To get vouched:

  1. Open a Vouch Request discussion.
  2. Describe what you want to change and why.
  3. Write in your own words — do not have an AI generate the request.
  4. A maintainer will comment /vouch if approved.
  5. Once vouched, open a new PR (preferred) or reopen this one after a few minutes.

See CONTRIBUTING.md for details.

github-actions Bot closed this Jun 5, 2026

@mmilutinovic371 (Author)

I have read the DCO document and I hereby sign the DCO.

@mmilutinovic371 (Author)

recheck

Comment on lines +1 to +15
// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0

use crate::ProviderDiscoverySpec;

pub const SPEC: ProviderDiscoverySpec = ProviderDiscoverySpec {
    id: "deepinfra",
    credential_env_vars: &["DEEPINFRA_API_KEY"],
};

test_discovers_env_credential!(
    discovers_deepinfra_env_credentials,
    "DEEPINFRA_API_KEY",
    "di-test123"
);

Collaborator

Remove this, we will only support providers v2.

@johntmyers (Collaborator)

Please re-open the PR. Also please update the PR to only support Providers v2.
