feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop by yzs15 · Pull Request #12 · agentserver/app

yzs15 · 2026-06-22T09:45:12Z

What

Lets Codex Desktop (and the Codex CLI / Linux headless agentserver that share ~/.codex/config.toml) drive glm-5.2 and deepseek-v4-pro in addition to gpt-5.5, all through the existing local proxy + ModelServer account.

Why

Codex dropped Chat Completions wire support in v0.81.0 and only speaks the Responses API. gpt-5.5 works because the gateway speaks Responses for it; GLM and DeepSeek don't. A protocol-conversion layer in the local proxy is the only way to keep one upstream + one key + zero per-app config.

How

internal/protoconv (new): pure request-mapper + streaming adapter for Responses ⇄ Chat Completions and Responses ⇄ Anthropic Messages. Unit-tested in isolation, including a real captured GLM SSE regression fixture.
internal/modelproxy: routes per-request by reading model from the request body. gpt-5.5 passes through unchanged; converted models go through protoconv. Unknown models pass through with a log, preserving gpt-5.5 parity as the safe default.
internal/codex + CLIs: agentctl set-model <name> and agentserver set-model <name> rewrite only the model field in config.toml via the existing merge+backup logic, preserving the per-user proxy token.

Catalog (table-driven, names == upstream names):

model	upstream path	behavior
`gpt-5.5`	`/v1/responses`	pass-through
`glm-5.2`	`/v1/messages` (Anthropic)	converted
`deepseek-v4-pro`	`/v1/chat/completions` (Chat)	converted

Real-machine verification (box `9.0.16.110`)

model	non-stream	stream
`gpt-5.5`	✅ 200 (pass-through)	—
`deepseek-v4-pro`	✅ 200 (Responses⇄Chat)	✅ full SSE with `output_text.delta`
`glm-5.2`	✅ 200 (Responses⇄Anthropic)	✅ full SSE with `output_text.delta`

Notes

v1 covers text + tool calls + streaming. File/audio/reasoning parity is a designed-for follow-up — the types admit them; converter is structured to extend.
scripts/windows-package-common.sh degrades the Authenticode check to a warning on Linux packaging hosts (no powershell.exe); the signature is re-verified at install time on the Windows target.

🤖 Generated with Claude Code

Add GLM 5.2 1m and deepseek-v4-pro to Codex via a Responses->Chat (deepseek) and Responses->Anthropic (glm) conversion layer in the existing local proxy, per-request routed by model. Aligned with the opencode-desktop-support 3-bucket routing already on origin/master. Co-Authored-By: Claude <noreply@anthropic.com>

12 TDD tasks: protoconv catalog + two converters (chat for deepseek, anthropic for glm) with contract-first streaming, proxy routing integration, codex config regression, and set-model on both CLIs. Co-Authored-By: Claude <noreply@anthropic.com>

Co-Authored-By: Claude <noreply@anthropic.com>

…id on deltas, preserve array tool outputs

Co-Authored-By: Claude <noreply@anthropic.com>

…ing blocks, test unknown-model pass-through Co-Authored-By: Claude <noreply@anthropic.com>

…ed by /v1/messages) Co-Authored-By: Claude <noreply@anthropic.com>

…ires it) Co-Authored-By: Claude <noreply@anthropic.com>

Proves WriteAnthropicStreamAsResponses correctly emits output_text.delta for real GLM /v1/messages SSE (converter is correct; live-serveConverted GLM-stream delta loss is a separate proxy-path issue). Co-Authored-By: Claude <noreply@anthropic.com>

The Responses API permits 'input' to be either an array of items or a bare string for a single user message. The Anthropic mapper only handled the array case; a string fell through with an empty messages list, which the gateway rejected (1214 messages 参数非法) — but only on streaming requests, which is what surfaced the bug. Mirror the bare-string handling already in the Chat Completions mapper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Cross-host Linux packaging boxes don't have powershell.exe, so the script failed hard before. The Authenticode signature is re-verified on the Windows target at install time by ensure-codex-desktop.ps1, so degrade to a WARNING here rather than failing the build. The MZ and size sanity checks above still run unconditionally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ed model The launcher writes ~/.codex/config.toml every time Codex Desktop starts, via codex.ModelserverProxySettings + UpdateConfig. ModelserverProxySettings inherited Model="gpt-5.5" from ModelserverSettings(), so every restart clobbered the user's set-model choice back to gpt-5.5 — making 'agentctl set-model glm-5.2' useless: it survived until the next launch. Split the two concerns at the Settings level: - ModelserverProxySettings now leaves Model empty (it is a provider-only update; the bearer token and base URL are the only things it must own). - UpdateConfig, when given Model="", preserves whatever model is already in the file. It only seeds the default (gpt-5.5) when the field is absent (first write). SetModel is unchanged (it explicitly sets Model), so user choice still wins via either CLI or the merge. Regression test covers the launcher-restart cycle end-to-end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ed/completed Codex's Responses SSE parser rejects streams whose response.completed event omits the response id with: stream disconnected before completion: failed to parse ResponseCompleted: missing field `id` Both converters were emitting: data: {"response":{"status":"completed"},"type":"response.completed"} Now they capture the upstream identity (Anthropic: message_start.message.id/model; Chat: each chunk's top-level id/model — first non-empty wins) and include it in both response.created and response.completed, alongside status (in_progress / completed). response.created is emitted lazily on first knowledge of the id, with a fallback so an empty stream still produces a well-formed pair. Regression tests added on both sides assert id+model are present in both events. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Codex's Responses parser uses 'if let Ok(item) = ...' on output_item.added and output_item.done payloads. Items that fail to deserialize as ResponseItem are silently dropped — no error, no UI text. We were sending the minimum shape: { type: 'message', id: 'msg_1' } { type: 'function_call', id: 'c1', call_id: 'c1', name: 'run' } Neither parses: ResponseItem::Message requires {role, content}, and ResponseItem::FunctionCall requires {name, arguments, call_id}. The streaming response.completed event landed fine (id was correct), so Codex saw a 'completed' turn with zero output items — hence 'no result' even when our SSE looked syntactically OK on the wire. Accumulate text and tool-arg deltas across the stream and emit fully populated items at close time, on both the Anthropic and Chat sides: output_item.added: { type:'message', id, role:'assistant', content:[] } output_item.done: { ..., content:[{type:'output_text', text:'<accum>'}] } output_item.added: { type:'function_call', id, call_id, name, arguments:'' } output_item.done: { ..., arguments:'<accumulated json>' } New tests on both converters parse every output_item.done frame and assert role/content for messages and name/arguments/call_id for tool calls. Match the schema in openai/codex codex-rs/protocol/src/models.rs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Codex Desktop's built-in model picker has not supported custom providers since 2026/02 (openai/codex #10867, #15138, #15364, #19694, #22160, #29156 — all open, no PR). To unblock users, expose model selection in our own control surface. End-to-end pieces: - protoconv.Route gains DisplayName; new Catalog() returns a copy of the full table for UI consumption. KnownModels()/LookupRoute unchanged. - codex.CurrentModel(path) reads the model field from config.toml (pure read; returns gpt-5.5 default when absent). - console.State exposes current_model + available_models, populated from the catalog + CurrentModel(deps.CodexConfigFile). - console.Controller.SetCodexModel validates against the catalog and calls codex.SetModel (which already preserves bearer token, base_url, etc.). - ui.ConsoleController gains SetCodexModel; server adds POST /api/console/model behind the existing trusted-mutation token check. - launcher passes CodexConfigFile into console.Deps. Frontend (Vue 3 + Element Plus): - api.ts: ConsoleState gets current_model + available_models; setConsoleModel() POSTs to /api/console/model with the token header. - Dashboard.vue: new card under the connection grid (codex_desktop only), el-radio-group of available models, on change calls setConsoleModel + refresh + ElMessage toast: "已切换到 X. 新建 Codex 对话生效（旧对话保持原模型）". Vitest covers picker rendering + click triggers setConsoleModel + refresh. Vite dist is committed (it's allow-listed in internal/ui/assets/.gitignore and embedded by //go:embed in internal/ui/server.go). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Codex sends prompt-author instructions with role="developer" (a Responses API role). DeepSeek's Chat Completions endpoint rejects it with: developer is not one of ['system', 'assistant', 'user', 'tool', 'function'] - 'messages.[0].role' The Anthropic converter already collapses {system, developer} message items into a merged top-level system parts list; mirror that on the Chat side: collect them into systemParts, prepend a single 'system' message before the rest. Other roles fall through unchanged. Regression test covers instructions + developer + inline system → single merged system message, no developer leak. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…an EOF After receiving [DONE] the converter broke out of the scan loop, leaving trailing bytes (blank lines, usage chunks DeepSeek sometimes emits after [DONE]) unread. The modelproxy's defer resp.Body.Close() then closed the body short of EOF, which net/http handles by RST-ing the underlying TCP connection rather than returning it to the keep-alive pool. The upstream gateway (code.ai.cs.ac.cn) interprets that RST as an abnormal client disconnect: the model run's turn state never transitions from in_progress, so the modelserver dashboard kept showing 'processing' indefinitely — even though Codex Desktop had already rendered the finished reply. Drain to EOF with io.Copy(io.Discard, r) before break. The Anthropic converter is unaffected (its loop runs to EOF naturally; the gateway closes the connection after message_stop). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

When the user hits 'open frontend' (or the same path on opencode/vscode), launcher always calls codex.UpdateConfig to refresh model_provider / base_url / bearer_token. UpdateConfig writes atomically: write tmp, rename onto config.toml. If Codex Desktop is already running, it holds a read handle on config.toml; Windows then rejects the rename with ERROR_ACCESS_DENIED, and the launcher returned 500 to the UI even though the launch itself would have succeeded. Two-layer fix: 1. codex/config.go: writeConfigFile now retries rename with short backoffs (~250ms total). Rides out brief reader-hold races without any caller-side knowledge. 2. cmd/launcher/main.go: the three launch paths (Codex Desktop, OpenCode Desktop, VSCode) now log+continue on UpdateConfig failure instead of returning the error. The running Codex wouldn't reload the file anyway, so the rewrite is best-effort once Codex is up. First-time launch (Codex not running, file unlocked) and agentctl/agentserver set-model writes (user-initiated, Codex picks up on next launch) are unaffected — they still see hard errors. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…picker Codex Desktop's picker filters out anything outside its bundled OpenAI catalog. openai/codex #15138, #19694, #22160, #29156 — all open since 2026/02. The official escape hatch is the (undocumented but live) model_catalog_json field, which fully replaces the bundled catalog with whatever JSON file it points at. We now emit that file on every Codex launch: - internal/codexdesktop/catalog.go (+model_template.json embed) Renders agentserver_model_catalog.json by cloning the bundled gpt-5.5 model row (extracted from openai/codex@main models.json) and overriding only slug+display_name per protoconv.Catalog() entry. Cloning the full row ensures every required ModelInfo field stays present so Codex's strict deserialize succeeds. Atomic write. - internal/codex Settings: new ModelCatalogJSON field; UpdateConfig writes it as top-level model_catalog_json. Empty leaves existing value untouched. - internal/paths: CodexModelCatalogFile = ~/.codex/agentserver_model_catalog.json - cmd/launcher: launchCompletedCodexDesktop writes the catalog then points config at it. Both steps tolerate failure (logged) — picker loses the named affordance but model selection from our own console still works and Codex falls back to its bundled catalog. - Dashboard subtitle nudges users that Codex's own picker now works too. Refresh model_template.json when Codex adds required ModelInfo fields across releases (otherwise catalog load fails and picker silently falls back). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…n model picker" This reverts commit c20df28.

yzs15 · 2026-06-23T03:22:37Z

Update — verified end-to-end on box `9.0.16.110`

Final state after on-box testing + several iterations of bug fixes:

Verified working

model	non-stream	stream	Codex Desktop UI
`gpt-5.5`	✅ pass-through	—	✅
`deepseek-v4-pro`	✅ Responses↔Chat	✅ full SSE w/ deltas	✅
`glm-5.2`	✅ Responses↔Anthropic	✅ full SSE w/ deltas	✅

Switching: agentctl set-model <name> (CLI) or the new selector card in the 星池指挥官 dashboard. New conversations pick up the change; old threads stay on the model they were created with (Codex Desktop persists per-thread model in state_5.sqlite).

Bugs fixed in this round

fix(protoconv): handle bare-string input in Anthropic request mapper — gateway returned 1214 messages 参数非法 when Codex sent input as a string instead of an array.
fix(codex): launcher's provider-only update must preserve user-selected model — launcher rewrote model = "gpt-5.5" on every Codex Desktop start, undoing user's set-model choice.
fix(protoconv): carry upstream id+model into streaming response.created/completed — Codex parser fails with missing field id without these.
fix(protoconv): emit ResponseItem-shaped output_item events for Codex — message items needed role + content, function_call needed arguments. Without these, Codex's if let Ok(item) = ... silently dropped every item — stream looked complete, UI got no text.
fix(protoconv): merge developer role into system for Chat Completions — DeepSeek rejected the developer role; merge into a single leading system message (mirrors the Anthropic mapper).
fix(protoconv): drain Chat SSE body after [DONE] so upstream sees clean EOF — leaving trailing bytes unread caused http.Client to RST the TCP connection; the upstream then never marked the turn complete.
fix(launcher): tolerate locked config.toml when launching Codex Desktop — Codex Desktop holds config.toml open while running; os.Rename returned ERROR_ACCESS_DENIED and the launch path 500'd. UpdateConfig now retries rename, and the launcher logs+continues on failure (the running Codex wouldn't reload the file anyway).
feat(console-ui): model selector in 星池指挥官 dashboard — Vue 3 + Element Plus card under the connection grid, fed by protoconv.Catalog(). Codex Desktop's own picker is broken for custom providers (openai/codex #15138 etc., all still open), so we expose model switching in our own surface.

What we didn't ship

Tried to make Codex Desktop's own picker show GLM/DeepSeek via model_catalog_json (commit c20df28, reverted in 1187521). Verified directly: the catalog file was loaded correctly and the gateway's /v1/models returned all 31 models, but Desktop's frontend has a hardcoded slug allow-list that filters everything to OpenAI-prefixed names regardless. This is the same family of bugs as openai/codex #15138 / #19694 / #29156 — it can't be fixed from the config side. Reverted to avoid feeding incorrect metadata (cloned from gpt-5.5's row) to Codex's core, which could trigger wrong reasoning-level / context-window decisions.

Known limitation (not ours)

The modelserver gateway dashboard shows DeepSeek conversations as permanently processing with 0 bytes and a - ID. Verified the daemon receives the full SSE stream and forwards it correctly to Codex; gpt-5.5 (openai_responses) and glm-5.2 (anthropic_messages) rows both transition to success with full stats. This is a missing statistics hook on the openai_chat_completions provider type in the gateway itself — to be reported separately to the modelserver team.

P1 (real, blocking for direct-config users): SetModel previously routed through UpdateConfig with a partial Settings, which deleted env_key when Settings.EnvKey == "". A valid direct-provider config (env_key = "OPENAI_API_KEY", no bearer token) silently became a proxy config (no env_key, legacy bearer token), breaking auth on the next Codex start. Rewrite SetModel as a self-contained one-field rewriter that touches only the top-level 'model' key, leaving every provider field untouched. First-call-on-missing-file still seeds the proxy defaults so headless agentserver/agentctl set-model on a fresh install keeps working. Regression covers both direct-config preservation and fresh-install seeding. P2 (real, latent): chat.go / anthropic.go forwarded root["stream"] as-is, yielding "stream": null on the wire when Codex omitted the field. Codex Desktop always sets it so the live path didn't fire, but other Codex clients (exec, review, app-server probes) may not. Forward only when present as a bool. P3 (real): execVSCode tolerated codex.UpdateConfig failures with a log+continue, copy-pasted from the Codex Desktop / OpenCode launch paths. The rationale there is that the Desktop app holds config.toml open and the rename loses the race; VS Code does NOT hold the file open, so the failure is real and must propagate (the Codex extension reads config.toml at session creation). P4 (real, correctness): Chat converter dropped parallel_tool_calls, tool_choice, and function tools' strict flag. Codex 0.142 sends parallel_tool_calls per model_info.supports_parallel_tool_calls and may set tool_choice; dropping either lets the upstream silently disregard client intent. Forward both, and preserve strict in the tool list. Anthropic converter does NOT forward these on purpose: parallel_tool_calls is not part of the Anthropic Messages API, and tool_choice has a different object shape than Codex's string — explicit comments document the omissions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

yzs15 · 2026-06-23T04:23:08Z

Addressed the four review findings in d60aa06:

#	Status	Fix
P1 SetModel corrupts direct configs	✅ confirmed real	`SetModel` rewritten as a one-field rewriter; no longer routes through `UpdateConfig`. Direct provider configs (`env_key`-based) survive intact; missing-file path still seeds proxy defaults so `agentctl set-model` on a fresh headless install still works. Two regression tests cover both axes.
P2 `"stream": null` for converted requests	✅ confirmed real (latent for Codex Desktop but trips other Codex clients)	Forward `stream` only when present as a bool, on both Chat and Anthropic sides. Regression test on both.
P3 VSCode launch hides config write failures	✅ confirmed real	`execVSCode` propagates `UpdateConfig` errors. The locked-file rationale only applies to Codex/OpenCode Desktop paths where the target app holds `config.toml` open; VS Code does not, so the failure must surface (the Codex extension reads the config at session creation).
P4 dropped `parallel_tool_calls` / `tool_choice` / `strict`	✅ confirmed real (correctness)	Chat: forward `parallel_tool_calls` + `tool_choice` when source sets them; preserve `strict` per-tool. Anthropic: deliberately not forwarded (`parallel_tool_calls` is not part of Anthropic Messages; `tool_choice` has a different object shape than Codex's string and there's no safe pass-through without a shape mapping). Both omissions are documented inline. Two regression tests on the Chat side.

All four bugs verified by reading the actual code paths, not taken on faith. Tests + full suite green.

…ew P5) Verified the report: chat.go was forwarding root["reasoning"] verbatim when present (even if null). The reviewer is correct that this is the same family of bug as the stream-null fix, but it has a second correctness layer beyond null-handling: reasoning is a Responses-API field. Chat Completions has no top-level reasoning. DeepSeek uses thinking: {type, reasoning_effort}; Anthropic uses thinking: {type, budget_tokens}; OpenAI Chat does not support reasoning at all. So even when Codex sends a non-null reasoning object, forwarding it raw produces a meaningless (or rejected) field. Drop the forward entirely. Per-upstream mapping (Responses reasoning -> DeepSeek thinking / Anthropic thinking) is a deferred feature; doing nothing is strictly safer than forwarding the wrong-shape field. The Anthropic converter never forwarded the top-level reasoning object (it only handled inline 'reasoning' input items, which it drops), so no change there. Regression: three cases — reasoning as object, reasoning as null, reasoning absent — all must produce a Chat body without a reasoning key. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

yzs15 · 2026-06-23T06:26:08Z

Fixed in 78d42d4. Confirmed real (verified against DeepSeek docs: top-level reasoning is not a valid Chat request field; DeepSeek uses thinking: {type, reasoning_effort}). Per-upstream mapping is a deferred feature; for now the Chat converter drops reasoning entirely. Anthropic converter was never affected (it doesn't forward the top-level field). Regression test covers reasoning-as-object, reasoning-as-null, and reasoning-absent — all must produce a Chat body without a reasoning key.

After PR #12, the local proxy is no longer just a token-refresh forwarder — it does per-model protocol conversion (gpt-5.5 passthrough, deepseek-v4-pro -> Chat Completions, glm-5.2 -> Anthropic Messages) and ships a model-picker affordance in the console UI plus agentctl/agentserver set-model. Update both user-facing docs so the positioning matches the actual product: - README.md: rename the bullet to '共享模型访问路径 + 多模型路由', list the per-model routing and the two ways to switch. - 项目描述.md: replace the 'forwards to .../v1' note with a routing table and document the switch UX plus the new-conversation-only semantics (old threads keep their creation-time model — Codex Desktop persists model per thread in state_5.sqlite). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Zishu Yu and others added 30 commits June 17, 2026 20:49

feat(protoconv): add model routing catalog

27689c3

feat(protoconv): Responses->Chat Completions request mapper

3cfd0d2

Co-Authored-By: Claude <noreply@anthropic.com>

feat(protoconv): Chat Completions->Responses response assembler

dd83b7f

Co-Authored-By: Claude <noreply@anthropic.com>

feat(protoconv): Chat Completions SSE -> Responses SSE

e33162c

feat(protoconv): Responses->Anthropic Messages request mapper

d6f6774

Co-Authored-By: Claude <noreply@anthropic.com>

feat(protoconv): Anthropic Messages->Responses response assembler

7a8646f

Co-Authored-By: Claude <noreply@anthropic.com>

feat(protoconv): Anthropic Messages SSE -> Responses SSE

283b306

fix(protoconv): balance streaming output_item added/done, carry item_…

88a37ab

…id on deltas, preserve array tool outputs

feat(modelproxy): route converted models through protoconv

2ccd275

feat(codex): add SetModel to rewrite only the model field

0b6d877

Co-Authored-By: Claude <noreply@anthropic.com>

feat(agentctl): add set-model subcommand

bf4229c

Co-Authored-By: Claude <noreply@anthropic.com>

feat(agentserver): add set-model subcommand

b72c4ed

Co-Authored-By: Claude <noreply@anthropic.com>

fix(codex): SetModel preserves existing per-user proxy token

3fa2d93

Co-Authored-By: Claude <noreply@anthropic.com>

fix(protoconv,modelproxy): always emit response.completed, drop think…

63b94f1

…ing blocks, test unknown-model pass-through Co-Authored-By: Claude <noreply@anthropic.com>

fix(protoconv): GLM model name is glm-5.2 on the gateway ([1m] reject…

1d8d572

…ed by /v1/messages) Co-Authored-By: Claude <noreply@anthropic.com>

fix(protoconv): Anthropic converter must set max_tokens (gateway requ…

b87fdc3

…ires it) Co-Authored-By: Claude <noreply@anthropic.com>

Revert "feat(codexdesktop): expose GLM/DeepSeek in Codex Desktop's ow…

1187521

…n model picker" This reverts commit c20df28.

yzs15 merged commit 30c6985 into master Jun 23, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop#12

feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop#12
yzs15 merged 32 commits into
masterfrom
worktree-glm-deepseek-codex-support

yzs15 commented Jun 22, 2026

Uh oh!

yzs15 commented Jun 23, 2026

Uh oh!

yzs15 commented Jun 23, 2026

Uh oh!

yzs15 commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

yzs15 commented Jun 22, 2026

What

Why

How

Real-machine verification (box 9.0.16.110)

Notes

Uh oh!

yzs15 commented Jun 23, 2026

Update — verified end-to-end on box 9.0.16.110

Verified working

Bugs fixed in this round

What we didn't ship

Known limitation (not ours)

Uh oh!

yzs15 commented Jun 23, 2026

Uh oh!

yzs15 commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Real-machine verification (box `9.0.16.110`)

Update — verified end-to-end on box `9.0.16.110`