OpenHands · enyst · Jun 9, 2026 · Jun 10, 2026
@@ -3,14 +3,14 @@
 description: Understanding the OpenHands Software Agent SDK's package structure, component interactions, and execution models.
 ---

 The **OpenHands Software Agent SDK** provides a unified, type-safe framework for building and deploying AI agents, from local experiments to full production systems. It focuses on **statelessness**, **composability**, and **clear boundaries** between research and deployment.

 Check [this document](/sdk/arch/design) for the core design principles that guided its architecture.

 ## Relationship with OpenHands Applications

 The Software Agent SDK serves as the **source of truth for agents** in OpenHands. The [OpenHands repository](https://github.com/OpenHands/OpenHands) provides interfaces—web app, CLI, and cloud—that consume the SDK APIs. This architecture ensures consistency and enables flexible integration patterns.
 - **Software Agent SDK = foundation.** The SDK defines all core components: agents, LLMs, conversations, tools, workspaces, events, and security policies.
 - **Interfaces reuse SDK objects.** The OpenHands GUI and CLI hydrate SDK components from persisted settings and orchestrate execution through SDK APIs.
 - **Consistent configuration.** Whether you launch an agent programmatically or via the OpenHands GUI, the supported parameters and defaults come from the SDK.

@@ -89,7 +89,7 @@
 - Perfect for prototyping and simple use cases
 - Quick setup, no Docker required

 #### Mode 2: Production / Sandboxed

 **Installation:** Install all 4 packages

@@ -131,7 +131,7 @@
 ```

 - `RemoteWorkspace` auto-spawns agent-server in containers
 - Sandboxed execution for security
 - Multi-user deployments
 - Distributed systems (e.g., Kubernetes) support

@@ -154,7 +154,7 @@
 - **[Condenser](/sdk/arch/condenser):** Conversation history compression for token management
 - **[Security](/sdk/arch/security):** Action risk assessment and validation before execution

 **Design:** Stateless, immutable components with type-safe Pydantic models.

 **Self-Contained:** Build and run agents with just `openhands-sdk` using `LocalWorkspace`.

@@ -184,7 +184,7 @@

 **Design:** All workspace implementations extend `RemoteWorkspace` from the SDK, adding container lifecycle or API client functionality.

 **Use Cases:** Sandboxed execution, multi-user deployments, production environments.

 <Note>
 For the full list of implemented workspaces, see the [source code](https://github.com/OpenHands/software-agent-sdk/tree/main/openhands-workspace).
@@ -196,6 +196,7 @@
 
 **Features:**
 - REST API & WebSocket endpoints for conversations, bash, files, events, desktop, and VSCode
+- [OpenAI-compatible `/v1/chat/completions` endpoint](/sdk/guides/agent-server/openai-gateway) for clients that expect an OpenAI-style backend
 - Service management with isolated per-user sessions
 - API key authentication and health checking
 

@@ -1,30 +1,214 @@
 ---
-title: OpenAI-Compatible Gateway
+title: OpenAI-Compatible Endpoint
 description: Call an OpenHands agent-server through the OpenAI Chat Completions protocol.
 ---
 
 import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx";
 
 The agent-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint so clients that already speak the OpenAI protocol can call an OpenHands agent.
 
-Use this when you want an existing chat UI, IDE integration, evaluation harness, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
+Use this when you want an existing chat UI, IDE integration, evaluation harness, voice platform, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
 
-## How it works
+## What to Configure
 
-1. Save an LLM profile through the agent-server profile API.
-2. List available gateway models with `GET /v1/models`.
-3. Call `POST /v1/chat/completions` with a model ID shaped like `openhands_<profile_name>`.
-4. Read `X-OpenHands-ServerConversation-ID` from the response.
-5. Pass that header back on later requests to continue the same OpenHands conversation.
+Most OpenAI-compatible clients ask for the same three fields:
+
+| Client Field | Value |
+| --- | --- |
+| Base URL | `https://YOUR_AGENT_SERVER/v1` |
+| API key | Your agent-server session API key |
+| Model | `openhands_<profile_name>` |
+
+For example, a saved LLM profile named `gateway_demo` appears as the OpenAI model `openhands_gateway_demo`.
 
 The gateway accepts the same session key in either OpenHands or OpenAI-compatible form:
 
 - `X-Session-API-Key: <key>`
 - `Authorization: Bearer <key>`
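The three client fields and the two header forms above can be sketched as small Python helpers. This is a minimal sketch, not part of the SDK: the function names `openai_client_settings` and `auth_headers` are hypothetical, and the only rules applied are the `/v1` base path, the `openhands_<profile_name>` model ID, and the two header names listed above.

```python
def openai_client_settings(
    agent_server_url: str, session_api_key: str, profile_name: str
) -> dict[str, str]:
    """Derive the base URL, API key, and model ID an OpenAI-style client asks for."""
    return {
        # Strip a trailing slash so the result never contains "//v1".
        "base_url": f"{agent_server_url.rstrip('/')}/v1",
        "api_key": session_api_key,
        "model": f"openhands_{profile_name}",
    }


def auth_headers(session_api_key: str, style: str = "openai") -> dict[str, str]:
    """Return the session key in OpenAI-compatible or OpenHands header form."""
    if style == "openai":
        return {"Authorization": f"Bearer {session_api_key}"}
    if style == "openhands":
        return {"X-Session-API-Key": session_api_key}
    raise ValueError(f"unknown header style: {style!r}")


settings = openai_client_settings("http://localhost:8000", "my-key", "gateway_demo")
print(settings["model"])
```

The `base_url` and `api_key` values map directly onto the constructor arguments of the OpenAI Python client; `auth_headers` is only needed for clients that build HTTP requests by hand.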
 
-<Note>
-The current gateway supports non-streaming Chat Completions requests. Requests with `stream: true` return a `400` response until streaming support is added.
-</Note>
+## Prepare a Profile
+
+OpenAI-compatible traffic is backed by an agent-server LLM profile. Create one with the native profile API first:
+
+```bash
+export AGENT_SERVER_URL="http://localhost:8000"
+export SESSION_API_KEY="your-session-api-key"
+export PROFILE_NAME="gateway_demo"
+export OPENHANDS_MODEL="openhands_${PROFILE_NAME}"
+
+curl -X POST "$AGENT_SERVER_URL/api/profiles/$PROFILE_NAME" \
+  -H "X-Session-API-Key: $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "llm": {
+      "model": "gpt-5-nano",
+      "api_key": "YOUR_LLM_API_KEY"
+    },
+    "include_secrets": true
+  }'
+```
+
+Then confirm the profile is visible to OpenAI clients:
+
+```bash
+curl "$AGENT_SERVER_URL/v1/models" \
+  -H "Authorization: Bearer $SESSION_API_KEY"
+```
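The same check can be scripted in Python with only the standard library. This is a sketch under one stated assumption: that `GET /v1/models` returns a body in the OpenAI list shape (`{"data": [{"id": ...}, ...]}`), which this page does not spell out. The helper names `fetch_models` and `gateway_model_ids` are hypothetical.

```python
import json
import urllib.request


def fetch_models(agent_server_url: str, session_api_key: str) -> dict:
    """GET /v1/models using the OpenAI-compatible Authorization header."""
    request = urllib.request.Request(
        f"{agent_server_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {session_api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def gateway_model_ids(payload: dict) -> list[str]:
    """Pick out model IDs that follow the `openhands_<profile_name>` convention.

    Assumes the OpenAI list shape: {"data": [{"id": "..."}, ...]}.
    """
    ids = [entry.get("id", "") for entry in payload.get("data", [])]
    return [model_id for model_id in ids if model_id.startswith("openhands_")]
```

For example, `gateway_model_ids(fetch_models(url, key))` should include `openhands_gateway_demo` once the profile above has been saved.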
+
+## Client Recipes
+
+<Tabs>
+<Tab title="curl">
+
+```bash
+curl -i "$AGENT_SERVER_URL/v1/chat/completions" \
+  -H "Authorization: Bearer $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"model\": \"$OPENHANDS_MODEL\",
+    \"messages\": [
+      {
+        \"role\": \"system\",
+        \"content\": \"Answer directly unless you need to inspect files.\"
+      },
+      {
+        \"role\": \"user\",
+        \"content\": \"Explain what this OpenHands endpoint does in one sentence.\"
+      }
+    ]
+  }"
+```
+
+The response includes `X-OpenHands-ServerConversation-ID`. Save that header if you want a later request to continue the same agent conversation.
+
+</Tab>
+<Tab title="Python SDK">
+
+```python
+import os
+
+from openai import OpenAI
+
+client = OpenAI(
+    api_key=os.environ["SESSION_API_KEY"],
+    base_url=f"{os.environ['AGENT_SERVER_URL']}/v1",
+)
+
+response = client.chat.completions.with_raw_response.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[
+        {"role": "user", "content": "Summarize this repository."},
+    ],
+)
+completion = response.parse()
+conversation_id = response.headers["X-OpenHands-ServerConversation-ID"]
+print(completion.choices[0].message.content)
+
+follow_up = client.chat.completions.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[{"role": "user", "content": "Now list the main packages."}],
+    extra_headers={"X-OpenHands-ServerConversation-ID": conversation_id},
+)
+print(follow_up.choices[0].message.content)
+```
+
+</Tab>
+<Tab title="JavaScript SDK">
+
+```javascript
+import OpenAI from "openai";
+
+const client = new OpenAI({
+  apiKey: process.env.SESSION_API_KEY,
+  baseURL: `${process.env.AGENT_SERVER_URL}/v1`,
+});
+
+const first = await client.chat.completions
+  .create({
+    model: process.env.OPENHANDS_MODEL,
+    messages: [
+      { role: "user", content: "Summarize this repository." },
+    ],
+  })
+  .withResponse();
+
+const conversationId = first.response.headers.get(
+  "x-openhands-serverconversation-id",
+);
+console.log(first.data.choices[0].message.content);
+
+const followUp = await client.chat.completions.create(
+  {
+    model: process.env.OPENHANDS_MODEL,
+    messages: [{ role: "user", content: "Now list the main packages." }],
+  },
+  {
+    headers: { "X-OpenHands-ServerConversation-ID": conversationId },
+  },
+);
+console.log(followUp.choices[0].message.content);
+```
+
+</Tab>
+<Tab title="Chat UIs">
+
+For Open WebUI, LibreChat, Chatbot UI, and similar OpenAI-compatible frontends, configure a custom OpenAI provider with:
+
+- **Base URL**: `https://YOUR_AGENT_SERVER/v1`
+- **API key**: your agent-server session API key
+- **Model**: `openhands_<profile_name>`
+- **Streaming**: disabled for now
+
+If the UI can store a response header and send a custom request header, persist `X-OpenHands-ServerConversation-ID` per chat thread and send it on follow-up turns. If it cannot, each request starts a new OpenHands conversation and works best for one-shot tasks.
+
+</Tab>
+<Tab title="Voice or Webhook">
+
+Voice platforms and webhook integrations usually have their own session or call ID. Store a mapping from that external ID to the OpenHands conversation ID:
+
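+```python
+import os
+
+from openai import OpenAI
+
+client = OpenAI(
+    api_key=os.environ["SESSION_API_KEY"],
+    base_url=f"{os.environ['AGENT_SERVER_URL']}/v1",
+)
+
+# Initialize this once at app startup, or replace it with durable session storage.
+conversation_ids: dict[str, str] = {}
+
+
+# `handle_turn` is an illustrative name; call it from your platform's turn or webhook handler.
+def handle_turn(platform_session_id: str, transcript_text: str) -> str:
+    """Send one turn from an external session to OpenHands and return the reply text."""
+    # Reuse the OpenHands conversation for this external session, if one exists.
+    conversation_id = conversation_ids.get(platform_session_id)
+    headers = {}
+    if conversation_id:
+        headers["X-OpenHands-ServerConversation-ID"] = conversation_id
+
+    response = client.chat.completions.with_raw_response.create(
+        model=os.environ.get("OPENHANDS_MODEL", "openhands_gateway_demo"),
+        messages=[{"role": "user", "content": transcript_text}],
+        extra_headers=headers,
+    )
+
+    # Remember the conversation ID so the next turn continues the same conversation.
+    conversation_ids[platform_session_id] = response.headers[
+        "X-OpenHands-ServerConversation-ID"
+    ]
+    reply_text = response.parse().choices[0].message.content
+    return reply_text
+```
+
+Return `reply_text` to the voice or webhook platform. Keep the mapping for as long as that external session should continue.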
+
+</Tab>
+</Tabs>
+
+## Conversation State
+
+The OpenAI Chat Completions protocol usually sends the full message history on every request. The OpenHands gateway does not reconstruct agent history from prior assistant messages. Instead:
+
+- Omit `X-OpenHands-ServerConversation-ID` to start a new OpenHands conversation.
+- Read `X-OpenHands-ServerConversation-ID` from the response.
+- Send that header on follow-up requests to continue the same OpenHands conversation.
+
+When reusing a conversation, send only the newest user turn in `messages`. The server-side OpenHands conversation owns the previous agent state, tool activity, and workspace context.
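The header handshake described above can be sketched as a small client-side helper. The class name `GatewayConversation` and its methods are hypothetical and not part of the SDK; the sketch only builds request arguments and reads response headers, so it can be paired with any HTTP or OpenAI client.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

CONVERSATION_HEADER = "X-OpenHands-ServerConversation-ID"


@dataclass
class GatewayConversation:
    """Client-side handle for one server-side OpenHands conversation."""

    model: str
    conversation_id: Optional[str] = None

    def request_kwargs(self, user_text: str) -> dict:
        """Build keyword arguments for `client.chat.completions.create`."""
        headers = {}
        if self.conversation_id:
            # Follow-up turn: point the gateway at the existing conversation.
            headers[CONVERSATION_HEADER] = self.conversation_id
        return {
            "model": self.model,
            # Only the newest user turn is sent; the server keeps earlier state.
            "messages": [{"role": "user", "content": user_text}],
            "extra_headers": headers,
        }

    def record_response_headers(self, headers: Mapping[str, str]) -> None:
        """Store the conversation ID returned by the server, if present."""
        # HTTP header names are case-insensitive, so compare in lowercase.
        for name, value in headers.items():
            if name.lower() == CONVERSATION_HEADER.lower():
                self.conversation_id = value
                return
```

With the OpenAI Python client, a turn would look like `raw = client.chat.completions.with_raw_response.create(**conv.request_kwargs(text))` followed by `conv.record_response_headers(raw.headers)`.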
+
+## Current Limitations
+
+- Only non-streaming Chat Completions requests are supported. Requests with `stream: true` return `400` until streaming support is added.
+- The response contains the final assistant text only. Internal OpenHands tool activity is not exposed as OpenAI tool calls.
+- OpenAI request fields that the gateway does not need are, by design, either ignored or rejected by the server.
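Clients that enable streaming by default can avoid the `400` response by dropping the `stream` field before sending. The helper below is a hypothetical client-side guard, not gateway behavior; it removes only `stream`, because this page does not say which other fields the server rejects.

```python
def without_streaming(payload: dict) -> dict:
    """Return a copy of a Chat Completions payload with `stream` removed."""
    cleaned = dict(payload)  # shallow copy; the caller's dict is left unchanged
    cleaned.pop("stream", None)
    return cleaned


request_body = without_streaming(
    {
        "model": "openhands_gateway_demo",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    }
)
print(sorted(request_body))
```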
 
 ## Ready-to-run example
 

@@ -3,7 +3,7 @@
 description: Run agents on remote servers with isolated workspaces for production deployments.
 ---

 Remote Agent Servers package the Software Agent SDK into containers you can deploy anywhere (Kubernetes, VMs, on‑prem, any cloud) with strong isolation. The remote path uses the exact same SDK API as local—switching is just changing the workspace argument; your Conversation code stays the same.


 For example, switching from a local workspace to a Docker‑based remote agent server:
@@ -38,11 +38,12 @@
 ## What is a Remote Agent Server?

 A Remote Agent Server is an HTTP/WebSocket server that:
 - **Packages the Software Agent SDK into containers** that you deploy on your own infrastructure (Kubernetes, VMs, on-prem, or cloud)
 - **Runs agents** on dedicated infrastructure
 - **Manages workspaces** (Docker containers or remote sandboxes)
 - **Streams events** to clients via WebSocket
 - **Handles command and file operations** (execute command, upload, download); see the [base class](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/base.py) for details
+- **Accepts OpenAI-compatible Chat Completions requests** through the [OpenAI-compatible endpoint](/sdk/guides/agent-server/openai-gateway)
 - **Provides isolation** between different agent executions
 
 Think of it as the "backend" for your agent, while your Python code acts as the "frontend" client.
@@ -111,7 +112,7 @@
 - Waits for it to be ready.
 - Shares the server URL with the SDK client.

 You don’t need to manage this manually—the workspace context handles startup and teardown automatically.

 ### 3. Event Streaming → *(Bidirectional WebSocket)*

@@ -141,7 +142,7 @@
 print(result.stdout)
 ```

 These commands are proxied through the agent server, whether it’s a Docker container or a remote VM, keeping your client code environment-agnostic.

 ### Summary

@@ -159,6 +160,7 @@
 - **[Local Agent Server](/sdk/guides/agent-server/local-server)** - Run agent server in the same process
 - **[Docker Sandboxed Server](/sdk/guides/agent-server/docker-sandbox)** - Run agent server in isolated Docker containers
 - **[API Sandboxed Server](/sdk/guides/agent-server/api-sandbox)** - Connect to hosted agent server via API
+- **[OpenAI-Compatible Endpoint](/sdk/guides/agent-server/openai-gateway)** - Access an OpenHands agent from OpenAI-compatible clients
 
 For architectural details:
 - **[Agent Server Package Architecture](/sdk/arch/agent-server)** - Remote execution architecture and deployment
@@ -12,14 +12,15 @@
 - One-off tasks, like building a README for your repo
 - Routine maintenance tasks, like updating dependencies
 - Major tasks that involve multiple agents, like refactors and rewrites
+- OpenAI-compatible access to an OpenHands agent from chat UIs, IDEs, voice platforms, and other clients
 
 You can even use the SDK to build new developer experiences—it’s the engine behind the [OpenHands CLI](/openhands/usage/cli/quick-start) and [OpenHands Cloud](/openhands/usage/cloud/openhands-cloud).
 
 Get started with some examples or keep reading to learn more.
 
 ## Features
 
-<Columns cols={3}>
+<Columns cols={4}>
   <Card title="Single Python API" icon="python">
     A unified Python API that enables you to run agents locally or in the cloud, define custom agent behaviors, and create custom tools.
   </Card>
@@ -29,32 +30,39 @@
   <Card title="REST-based Agent Server" icon="server">
     A production-ready server that runs agents anywhere, including Docker and Kubernetes, while connecting seamlessly to the Python API.
   </Card>
+  <Card
+    title="OpenAI-Compatible Endpoint"
+    icon="plug"
+    href="/sdk/guides/agent-server/openai-gateway"
+  >
+    Access the OpenHands agent via an OpenAI-compatible endpoint for chat UIs, IDEs, voice platforms, and other OpenAI-style clients.
+  </Card>
 </Columns>
 
 ## Why OpenHands Software Agent SDK?

 ### Emphasis on coding

 While other agent SDKs (e.g. [LangChain](https://python.langchain.com/docs/tutorials/agents/)) are focused on more general use cases, like delivering chat-based support or automating back-office tasks, OpenHands is purpose-built for software engineering.

 While some folks do use OpenHands to solve more general tasks (code is a powerful tool!), most of us use OpenHands to work with code.

 ### State-of-the-Art Performance

 OpenHands is a top performer across a wide variety of benchmarks, including SWE-bench, SWT-bench, and multi-SWE-bench. The SDK includes a number of state-of-the-art agentic features developed by our research team, including:

 - Task planning and decomposition
 - Automatic context compression
 - Security analysis
 - Strong agent-computer interfaces

 OpenHands has attracted researchers from a wide variety of academic institutions, and is [becoming the preferred harness](https://x.com/Alibaba_Qwen/status/1947766835023335516) for evaluating LLMs on coding tasks.

 ### Free and Open Source

 OpenHands is also the leading open source framework for coding agents. It’s MIT-licensed and works with any LLM, including big proprietary LLMs like Claude and GPT, as well as open source LLMs like Qwen and Devstral.

 Other SDKs (e.g. [Claude Code](https://github.com/anthropics/claude-agent-sdk-python)) are proprietary and lock you into a particular model. Given how quickly models are evolving, it’s best to stay model-agnostic!

 ## Get Started

@@ -97,7 +105,7 @@
    title="Remote Execution"
    href="/sdk/guides/agent-server/local-server"
  >
    Run agents on remote servers with Docker sandboxing.
  </Card>
  <Card
    title="GitHub Workflows"