feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop#12
Conversation
Add GLM 5.2 1m and deepseek-v4-pro to Codex via a Responses->Chat (deepseek) and Responses->Anthropic (glm) conversion layer in the existing local proxy, per-request routed by model. Aligned with the opencode-desktop-support 3-bucket routing already on origin/master. Co-Authored-By: Claude <noreply@anthropic.com>
12 TDD tasks: protoconv catalog + two converters (chat for deepseek, anthropic for glm) with contract-first streaming, proxy routing integration, codex config regression, and set-model on both CLIs. Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…id on deltas, preserve array tool outputs
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…ing blocks, test unknown-model pass-through Co-Authored-By: Claude <noreply@anthropic.com>
…ed by /v1/messages) Co-Authored-By: Claude <noreply@anthropic.com>
…ires it) Co-Authored-By: Claude <noreply@anthropic.com>
Proves WriteAnthropicStreamAsResponses correctly emits output_text.delta for real GLM /v1/messages SSE (converter is correct; live-serveConverted GLM-stream delta loss is a separate proxy-path issue). Co-Authored-By: Claude <noreply@anthropic.com>
The Responses API permits 'input' to be either an array of items or a bare string for a single user message. The Anthropic mapper only handled the array case; a string fell through with an empty messages list, which the gateway rejected (1214 messages 参数非法) — but only on streaming requests, which is what surfaced the bug. Mirror the bare-string handling already in the Chat Completions mapper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cross-host Linux packaging boxes don't have powershell.exe, so the script failed hard before. The Authenticode signature is re-verified on the Windows target at install time by ensure-codex-desktop.ps1, so degrade to a WARNING here rather than failing the build. The MZ and size sanity checks above still run unconditionally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed model The launcher writes ~/.codex/config.toml every time Codex Desktop starts, via codex.ModelserverProxySettings + UpdateConfig. ModelserverProxySettings inherited Model="gpt-5.5" from ModelserverSettings(), so every restart clobbered the user's set-model choice back to gpt-5.5 — making 'agentctl set-model glm-5.2' useless: it survived until the next launch. Split the two concerns at the Settings level: - ModelserverProxySettings now leaves Model empty (it is a provider-only update; the bearer token and base URL are the only things it must own). - UpdateConfig, when given Model="", preserves whatever model is already in the file. It only seeds the default (gpt-5.5) when the field is absent (first write). SetModel is unchanged (it explicitly sets Model), so user choice still wins via either CLI or the merge. Regression test covers the launcher-restart cycle end-to-end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed/completed
Codex's Responses SSE parser rejects streams whose response.completed event
omits the response id with:
stream disconnected before completion: failed to parse ResponseCompleted:
missing field `id`
Both converters were emitting:
data: {"response":{"status":"completed"},"type":"response.completed"}
Now they capture the upstream identity (Anthropic: message_start.message.id/model;
Chat: each chunk's top-level id/model — first non-empty wins) and include it in
both response.created and response.completed, alongside status (in_progress /
completed). response.created is emitted lazily on first knowledge of the id,
with a fallback so an empty stream still produces a well-formed pair.
Regression tests added on both sides assert id+model are present in both events.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex's Responses parser uses 'if let Ok(item) = ...' on output_item.added
and output_item.done payloads. Items that fail to deserialize as
ResponseItem are silently dropped — no error, no UI text. We were sending
the minimum shape:
{ type: 'message', id: 'msg_1' }
{ type: 'function_call', id: 'c1', call_id: 'c1', name: 'run' }
Neither parses: ResponseItem::Message requires {role, content}, and
ResponseItem::FunctionCall requires {name, arguments, call_id}. The
streaming response.completed event landed fine (id was correct), so Codex
saw a 'completed' turn with zero output items — hence 'no result' even
when our SSE looked syntactically OK on the wire.
Accumulate text and tool-arg deltas across the stream and emit fully
populated items at close time, on both the Anthropic and Chat sides:
output_item.added: { type:'message', id, role:'assistant', content:[] }
output_item.done: { ..., content:[{type:'output_text', text:'<accum>'}] }
output_item.added: { type:'function_call', id, call_id, name, arguments:'' }
output_item.done: { ..., arguments:'<accumulated json>' }
New tests on both converters parse every output_item.done frame and assert
role/content for messages and name/arguments/call_id for tool calls. Match
the schema in openai/codex codex-rs/protocol/src/models.rs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex Desktop's built-in model picker has not supported custom providers since 2026/02 (openai/codex #10867, #15138, #15364, #19694, #22160, #29156 — all open, no PR). To unblock users, expose model selection in our own control surface. End-to-end pieces: - protoconv.Route gains DisplayName; new Catalog() returns a copy of the full table for UI consumption. KnownModels()/LookupRoute unchanged. - codex.CurrentModel(path) reads the model field from config.toml (pure read; returns gpt-5.5 default when absent). - console.State exposes current_model + available_models, populated from the catalog + CurrentModel(deps.CodexConfigFile). - console.Controller.SetCodexModel validates against the catalog and calls codex.SetModel (which already preserves bearer token, base_url, etc.). - ui.ConsoleController gains SetCodexModel; server adds POST /api/console/model behind the existing trusted-mutation token check. - launcher passes CodexConfigFile into console.Deps. Frontend (Vue 3 + Element Plus): - api.ts: ConsoleState gets current_model + available_models; setConsoleModel() POSTs to /api/console/model with the token header. - Dashboard.vue: new card under the connection grid (codex_desktop only), el-radio-group of available models, on change calls setConsoleModel + refresh + ElMessage toast: "已切换到 X. 新建 Codex 对话生效(旧对话保持原模型)". Vitest covers picker rendering + click triggers setConsoleModel + refresh. Vite dist is committed (it's allow-listed in internal/ui/assets/.gitignore and embedded by //go:embed in internal/ui/server.go). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex sends prompt-author instructions with role="developer" (a Responses
API role). DeepSeek's Chat Completions endpoint rejects it with:
developer is not one of ['system', 'assistant', 'user', 'tool', 'function']
- 'messages.[0].role'
The Anthropic converter already collapses {system, developer} message items
into a merged top-level system parts list; mirror that on the Chat side:
collect them into systemParts, prepend a single 'system' message before
the rest. Other roles fall through unchanged.
Regression test covers instructions + developer + inline system → single
merged system message, no developer leak.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…an EOF After receiving [DONE] the converter broke out of the scan loop, leaving trailing bytes (blank lines, usage chunks DeepSeek sometimes emits after [DONE]) unread. The modelproxy's defer resp.Body.Close() then closed the body short of EOF, which net/http handles by RST-ing the underlying TCP connection rather than returning it to the keep-alive pool. The upstream gateway (code.ai.cs.ac.cn) interprets that RST as an abnormal client disconnect: the model run's turn state never transitions from in_progress, so the modelserver dashboard kept showing 'processing' indefinitely — even though Codex Desktop had already rendered the finished reply. Drain to EOF with io.Copy(io.Discard, r) before break. The Anthropic converter is unaffected (its loop runs to EOF naturally; the gateway closes the connection after message_stop). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the user hits 'open frontend' (or the same path on opencode/vscode), launcher always calls codex.UpdateConfig to refresh model_provider / base_url / bearer_token. UpdateConfig writes atomically: write tmp, rename onto config.toml. If Codex Desktop is already running, it holds a read handle on config.toml; Windows then rejects the rename with ERROR_ACCESS_DENIED, and the launcher returned 500 to the UI even though the launch itself would have succeeded. Two-layer fix: 1. codex/config.go: writeConfigFile now retries rename with short backoffs (~250ms total). Rides out brief reader-hold races without any caller-side knowledge. 2. cmd/launcher/main.go: the three launch paths (Codex Desktop, OpenCode Desktop, VSCode) now log+continue on UpdateConfig failure instead of returning the error. The running Codex wouldn't reload the file anyway, so the rewrite is best-effort once Codex is up. First-time launch (Codex not running, file unlocked) and agentctl/agentserver set-model writes (user-initiated, Codex picks up on next launch) are unaffected — they still see hard errors. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…picker Codex Desktop's picker filters out anything outside its bundled OpenAI catalog. openai/codex #15138, #19694, #22160, #29156 — all open since 2026/02. The official escape hatch is the (undocumented but live) model_catalog_json field, which fully replaces the bundled catalog with whatever JSON file it points at. We now emit that file on every Codex launch: - internal/codexdesktop/catalog.go (+model_template.json embed) Renders agentserver_model_catalog.json by cloning the bundled gpt-5.5 model row (extracted from openai/codex@main models.json) and overriding only slug+display_name per protoconv.Catalog() entry. Cloning the full row ensures every required ModelInfo field stays present so Codex's strict deserialize succeeds. Atomic write. - internal/codex Settings: new ModelCatalogJSON field; UpdateConfig writes it as top-level model_catalog_json. Empty leaves existing value untouched. - internal/paths: CodexModelCatalogFile = ~/.codex/agentserver_model_catalog.json - cmd/launcher: launchCompletedCodexDesktop writes the catalog then points config at it. Both steps tolerate failure (logged) — picker loses the named affordance but model selection from our own console still works and Codex falls back to its bundled catalog. - Dashboard subtitle nudges users that Codex's own picker now works too. Refresh model_template.json when Codex adds required ModelInfo fields across releases (otherwise catalog load fails and picker silently falls back). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n model picker" This reverts commit c20df28.
Update — verified end-to-end on box
|
| model | non-stream | stream | Codex Desktop UI |
|---|---|---|---|
gpt-5.5 |
✅ pass-through | — | ✅ |
deepseek-v4-pro |
✅ Responses↔Chat | ✅ full SSE w/ deltas | ✅ |
glm-5.2 |
✅ Responses↔Anthropic | ✅ full SSE w/ deltas | ✅ |
Switching: agentctl set-model <name> (CLI) or the new selector card in the 星池指挥官 dashboard. New conversations pick up the change; old threads stay on the model they were created with (Codex Desktop persists per-thread model in state_5.sqlite).
Bugs fixed in this round
fix(protoconv): handle bare-string input in Anthropic request mapper— gateway returned1214 messages 参数非法when Codex sentinputas a string instead of an array.fix(codex): launcher's provider-only update must preserve user-selected model— launcher rewrotemodel = "gpt-5.5"on every Codex Desktop start, undoing user'sset-modelchoice.fix(protoconv): carry upstream id+model into streaming response.created/completed— Codex parser fails withmissing field idwithout these.fix(protoconv): emit ResponseItem-shaped output_item events for Codex— message items neededrole+content, function_call neededarguments. Without these, Codex'sif let Ok(item) = ...silently dropped every item — stream looked complete, UI got no text.fix(protoconv): merge developer role into system for Chat Completions— DeepSeek rejected thedeveloperrole; merge into a single leadingsystemmessage (mirrors the Anthropic mapper).fix(protoconv): drain Chat SSE body after [DONE] so upstream sees clean EOF— leaving trailing bytes unread causedhttp.Clientto RST the TCP connection; the upstream then never marked the turn complete.fix(launcher): tolerate locked config.toml when launching Codex Desktop— Codex Desktop holdsconfig.tomlopen while running;os.RenamereturnedERROR_ACCESS_DENIEDand the launch path 500'd. UpdateConfig now retries rename, and the launcher logs+continues on failure (the running Codex wouldn't reload the file anyway).feat(console-ui): model selector in 星池指挥官 dashboard— Vue 3 + Element Plus card under the connection grid, fed byprotoconv.Catalog(). Codex Desktop's own picker is broken for custom providers (openai/codex #15138 etc., all still open), so we expose model switching in our own surface.
What we didn't ship
Tried to make Codex Desktop's own picker show GLM/DeepSeek via model_catalog_json (commit c20df28, reverted in 1187521). Verified directly: the catalog file was loaded correctly and the gateway's /v1/models returned all 31 models, but Desktop's frontend has a hardcoded slug allow-list that filters everything to OpenAI-prefixed names regardless. This is the same family of bugs as openai/codex #15138 / #19694 / #29156 — it can't be fixed from the config side. Reverted to avoid feeding incorrect metadata (cloned from gpt-5.5's row) to Codex's core, which could trigger wrong reasoning-level / context-window decisions.
Known limitation (not ours)
The modelserver gateway dashboard shows DeepSeek conversations as permanently processing with 0 bytes and a - ID. Verified the daemon receives the full SSE stream and forwards it correctly to Codex; gpt-5.5 (openai_responses) and glm-5.2 (anthropic_messages) rows both transition to success with full stats. This is a missing statistics hook on the openai_chat_completions provider type in the gateway itself — to be reported separately to the modelserver team.
P1 (real, blocking for direct-config users): SetModel previously routed through UpdateConfig with a partial Settings, which deleted env_key when Settings.EnvKey == "". A valid direct-provider config (env_key = "OPENAI_API_KEY", no bearer token) silently became a proxy config (no env_key, legacy bearer token), breaking auth on the next Codex start. Rewrite SetModel as a self-contained one-field rewriter that touches only the top-level 'model' key, leaving every provider field untouched. First-call-on-missing-file still seeds the proxy defaults so headless agentserver/agentctl set-model on a fresh install keeps working. Regression covers both direct-config preservation and fresh-install seeding. P2 (real, latent): chat.go / anthropic.go forwarded root["stream"] as-is, yielding "stream": null on the wire when Codex omitted the field. Codex Desktop always sets it so the live path didn't fire, but other Codex clients (exec, review, app-server probes) may not. Forward only when present as a bool. P3 (real): execVSCode tolerated codex.UpdateConfig failures with a log+continue, copy-pasted from the Codex Desktop / OpenCode launch paths. The rationale there is that the Desktop app holds config.toml open and the rename loses the race; VS Code does NOT hold the file open, so the failure is real and must propagate (the Codex extension reads config.toml at session creation). P4 (real, correctness): Chat converter dropped parallel_tool_calls, tool_choice, and function tools' strict flag. Codex 0.142 sends parallel_tool_calls per model_info.supports_parallel_tool_calls and may set tool_choice; dropping either lets the upstream silently disregard client intent. Forward both, and preserve strict in the tool list. Anthropic converter does NOT forward these on purpose: parallel_tool_calls is not part of the Anthropic Messages API, and tool_choice has a different object shape than Codex's string — explicit comments document the omissions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed the four review findings in
All four bugs verified by reading the actual code paths, not taken on faith. Tests + full suite green. |
…ew P5)
Verified the report: chat.go was forwarding root["reasoning"] verbatim
when present (even if null). The reviewer is correct that this is the
same family of bug as the stream-null fix, but it has a second
correctness layer beyond null-handling:
reasoning is a Responses-API field. Chat Completions has no top-level
reasoning. DeepSeek uses thinking: {type, reasoning_effort}; Anthropic
uses thinking: {type, budget_tokens}; OpenAI Chat does not support
reasoning at all. So even when Codex sends a non-null reasoning
object, forwarding it raw produces a meaningless (or rejected) field.
Drop the forward entirely. Per-upstream mapping (Responses reasoning ->
DeepSeek thinking / Anthropic thinking) is a deferred feature; doing
nothing is strictly safer than forwarding the wrong-shape field.
The Anthropic converter never forwarded the top-level reasoning object
(it only handled inline 'reasoning' input items, which it drops), so
no change there.
Regression: three cases — reasoning as object, reasoning as null,
reasoning absent — all must produce a Chat body without a reasoning
key.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Fixed in |
After PR #12, the local proxy is no longer just a token-refresh forwarder — it does per-model protocol conversion (gpt-5.5 passthrough, deepseek-v4-pro -> Chat Completions, glm-5.2 -> Anthropic Messages) and ships a model-picker affordance in the console UI plus agentctl/agentserver set-model. Update both user-facing docs so the positioning matches the actual product: - README.md: rename the bullet to '共享模型访问路径 + 多模型路由', list the per-model routing and the two ways to switch. - 项目描述.md: replace the 'forwards to .../v1' note with a routing table and document the switch UX plus the new-conversation-only semantics (old threads keep their creation-time model — Codex Desktop persists model per thread in state_5.sqlite). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Lets Codex Desktop (and the Codex CLI / Linux headless
agentserverthat share~/.codex/config.toml) driveglm-5.2anddeepseek-v4-proin addition togpt-5.5, all through the existing local proxy + ModelServer account.Why
Codex dropped Chat Completions wire support in v0.81.0 and only speaks the Responses API.
gpt-5.5works because the gateway speaks Responses for it; GLM and DeepSeek don't. A protocol-conversion layer in the local proxy is the only way to keep one upstream + one key + zero per-app config.How
internal/protoconv(new): pure request-mapper + streaming adapter for Responses ⇄ Chat Completions and Responses ⇄ Anthropic Messages. Unit-tested in isolation, including a real captured GLM SSE regression fixture.internal/modelproxy: routes per-request by readingmodelfrom the request body.gpt-5.5passes through unchanged; converted models go throughprotoconv. Unknown models pass through with a log, preservinggpt-5.5parity as the safe default.internal/codex+ CLIs:agentctl set-model <name>andagentserver set-model <name>rewrite only themodelfield inconfig.tomlvia the existing merge+backup logic, preserving the per-user proxy token.Catalog (table-driven, names == upstream names):
gpt-5.5/v1/responsesglm-5.2/v1/messages(Anthropic)deepseek-v4-pro/v1/chat/completions(Chat)Real-machine verification (box
9.0.16.110)gpt-5.5deepseek-v4-prooutput_text.deltaglm-5.2output_text.deltaNotes
scripts/windows-package-common.shdegrades the Authenticode check to a warning on Linux packaging hosts (nopowershell.exe); the signature is re-verified at install time on the Windows target.🤖 Generated with Claude Code