Skip to content

feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop#12

Merged
yzs15 merged 32 commits into
masterfrom
worktree-glm-deepseek-codex-support
Jun 23, 2026
Merged

feat: GLM-5.2 and DeepSeek-v4-pro support in Codex Desktop#12
yzs15 merged 32 commits into
masterfrom
worktree-glm-deepseek-codex-support

Conversation

@yzs15

@yzs15 yzs15 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

What

Lets Codex Desktop (and the Codex CLI / Linux headless agentserver that share ~/.codex/config.toml) drive glm-5.2 and deepseek-v4-pro in addition to gpt-5.5, all through the existing local proxy + ModelServer account.

Why

Codex dropped Chat Completions wire support in v0.81.0 and only speaks the Responses API. gpt-5.5 works because the gateway speaks Responses for it; GLM and DeepSeek don't. A protocol-conversion layer in the local proxy is the only way to keep one upstream + one key + zero per-app config.

How

  • internal/protoconv (new): pure request-mapper + streaming adapter for Responses ⇄ Chat Completions and Responses ⇄ Anthropic Messages. Unit-tested in isolation, including a real captured GLM SSE regression fixture.
  • internal/modelproxy: routes per-request by reading model from the request body. gpt-5.5 passes through unchanged; converted models go through protoconv. Unknown models pass through with a log, preserving gpt-5.5 parity as the safe default.
  • internal/codex + CLIs: agentctl set-model <name> and agentserver set-model <name> rewrite only the model field in config.toml via the existing merge+backup logic, preserving the per-user proxy token.

Catalog (table-driven, names == upstream names):

model upstream path behavior
gpt-5.5 /v1/responses pass-through
glm-5.2 /v1/messages (Anthropic) converted
deepseek-v4-pro /v1/chat/completions (Chat) converted

Real-machine verification (box 9.0.16.110)

model non-stream stream
gpt-5.5 ✅ 200 (pass-through)
deepseek-v4-pro ✅ 200 (Responses⇄Chat) ✅ full SSE with output_text.delta
glm-5.2 ✅ 200 (Responses⇄Anthropic) ✅ full SSE with output_text.delta

Notes

  • v1 covers text + tool calls + streaming. File/audio/reasoning parity is a designed-for follow-up — the types admit them; converter is structured to extend.
  • scripts/windows-package-common.sh degrades the Authenticode check to a warning on Linux packaging hosts (no powershell.exe); the signature is re-verified at install time on the Windows target.

🤖 Generated with Claude Code

Zishu Yu and others added 30 commits June 17, 2026 20:49
Add GLM 5.2 1m and deepseek-v4-pro to Codex via a Responses->Chat
(deepseek) and Responses->Anthropic (glm) conversion layer in the
existing local proxy, per-request routed by model. Aligned with the
opencode-desktop-support 3-bucket routing already on origin/master.

Co-Authored-By: Claude <noreply@anthropic.com>
12 TDD tasks: protoconv catalog + two converters (chat for deepseek,
anthropic for glm) with contract-first streaming, proxy routing
integration, codex config regression, and set-model on both CLIs.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…ing blocks, test unknown-model pass-through

Co-Authored-By: Claude <noreply@anthropic.com>
…ed by /v1/messages)

Co-Authored-By: Claude <noreply@anthropic.com>
…ires it)

Co-Authored-By: Claude <noreply@anthropic.com>
Proves WriteAnthropicStreamAsResponses correctly emits output_text.delta
for real GLM /v1/messages SSE (converter is correct; live-serveConverted
GLM-stream delta loss is a separate proxy-path issue).

Co-Authored-By: Claude <noreply@anthropic.com>
The Responses API permits 'input' to be either an array of items or a bare
string for a single user message. The Anthropic mapper only handled the
array case; a string fell through with an empty messages list, which the
gateway rejected (1214 messages 参数非法) — but only on streaming requests,
which is what surfaced the bug.

Mirror the bare-string handling already in the Chat Completions mapper.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cross-host Linux packaging boxes don't have powershell.exe, so the
script failed hard before. The Authenticode signature is re-verified
on the Windows target at install time by ensure-codex-desktop.ps1,
so degrade to a WARNING here rather than failing the build. The MZ
and size sanity checks above still run unconditionally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed model

The launcher writes ~/.codex/config.toml every time Codex Desktop starts,
via codex.ModelserverProxySettings + UpdateConfig. ModelserverProxySettings
inherited Model="gpt-5.5" from ModelserverSettings(), so every restart
clobbered the user's set-model choice back to gpt-5.5 — making
'agentctl set-model glm-5.2' useless: it survived until the next launch.

Split the two concerns at the Settings level:

- ModelserverProxySettings now leaves Model empty (it is a provider-only
  update; the bearer token and base URL are the only things it must own).
- UpdateConfig, when given Model="", preserves whatever model is already
  in the file. It only seeds the default (gpt-5.5) when the field is
  absent (first write).

SetModel is unchanged (it explicitly sets Model), so user choice still
wins via either CLI or the merge.

Regression test covers the launcher-restart cycle end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed/completed

Codex's Responses SSE parser rejects streams whose response.completed event
omits the response id with:

  stream disconnected before completion: failed to parse ResponseCompleted:
  missing field `id`

Both converters were emitting:
  data: {"response":{"status":"completed"},"type":"response.completed"}

Now they capture the upstream identity (Anthropic: message_start.message.id/model;
Chat: each chunk's top-level id/model — first non-empty wins) and include it in
both response.created and response.completed, alongside status (in_progress /
completed). response.created is emitted lazily on first knowledge of the id,
with a fallback so an empty stream still produces a well-formed pair.

Regression tests added on both sides assert id+model are present in both events.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex's Responses parser uses 'if let Ok(item) = ...' on output_item.added
and output_item.done payloads. Items that fail to deserialize as
ResponseItem are silently dropped — no error, no UI text. We were sending
the minimum shape:

  { type: 'message', id: 'msg_1' }
  { type: 'function_call', id: 'c1', call_id: 'c1', name: 'run' }

Neither parses: ResponseItem::Message requires {role, content}, and
ResponseItem::FunctionCall requires {name, arguments, call_id}. The
streaming response.completed event landed fine (id was correct), so Codex
saw a 'completed' turn with zero output items — hence 'no result' even
when our SSE looked syntactically OK on the wire.

Accumulate text and tool-arg deltas across the stream and emit fully
populated items at close time, on both the Anthropic and Chat sides:

  output_item.added: { type:'message', id, role:'assistant', content:[] }
  output_item.done:  { ...,  content:[{type:'output_text', text:'<accum>'}] }

  output_item.added: { type:'function_call', id, call_id, name, arguments:'' }
  output_item.done:  { ...,  arguments:'<accumulated json>' }

New tests on both converters parse every output_item.done frame and assert
role/content for messages and name/arguments/call_id for tool calls. Match
the schema in openai/codex codex-rs/protocol/src/models.rs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex Desktop's built-in model picker has not supported custom providers
since 2026/02 (openai/codex #10867, #15138, #15364, #19694, #22160, #29156
— all open, no PR). To unblock users, expose model selection in our own
control surface.

End-to-end pieces:

- protoconv.Route gains DisplayName; new Catalog() returns a copy of the
  full table for UI consumption. KnownModels()/LookupRoute unchanged.
- codex.CurrentModel(path) reads the model field from config.toml
  (pure read; returns gpt-5.5 default when absent).
- console.State exposes current_model + available_models, populated from
  the catalog + CurrentModel(deps.CodexConfigFile).
- console.Controller.SetCodexModel validates against the catalog and calls
  codex.SetModel (which already preserves bearer token, base_url, etc.).
- ui.ConsoleController gains SetCodexModel; server adds POST
  /api/console/model behind the existing trusted-mutation token check.
- launcher passes CodexConfigFile into console.Deps.

Frontend (Vue 3 + Element Plus):

- api.ts: ConsoleState gets current_model + available_models;
  setConsoleModel() POSTs to /api/console/model with the token header.
- Dashboard.vue: new card under the connection grid (codex_desktop only),
  el-radio-group of available models, on change calls setConsoleModel +
  refresh + ElMessage toast: "已切换到 X. 新建 Codex 对话生效(旧对话保持原模型)".

Vitest covers picker rendering + click triggers setConsoleModel + refresh.
Vite dist is committed (it's allow-listed in internal/ui/assets/.gitignore
and embedded by //go:embed in internal/ui/server.go).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex sends prompt-author instructions with role="developer" (a Responses
API role). DeepSeek's Chat Completions endpoint rejects it with:

  developer is not one of ['system', 'assistant', 'user', 'tool', 'function']
   - 'messages.[0].role'

The Anthropic converter already collapses {system, developer} message items
into a merged top-level system parts list; mirror that on the Chat side:
collect them into systemParts, prepend a single 'system' message before
the rest. Other roles fall through unchanged.

Regression test covers instructions + developer + inline system → single
merged system message, no developer leak.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…an EOF

After receiving [DONE] the converter broke out of the scan loop, leaving
trailing bytes (blank lines, usage chunks DeepSeek sometimes emits after
[DONE]) unread. The modelproxy's defer resp.Body.Close() then closed the
body short of EOF, which net/http handles by RST-ing the underlying TCP
connection rather than returning it to the keep-alive pool.

The upstream gateway (code.ai.cs.ac.cn) interprets that RST as an
abnormal client disconnect: the model run's turn state never transitions
from in_progress, so the modelserver dashboard kept showing 'processing'
indefinitely — even though Codex Desktop had already rendered the
finished reply.

Drain to EOF with io.Copy(io.Discard, r) before break. The Anthropic
converter is unaffected (its loop runs to EOF naturally; the gateway
closes the connection after message_stop).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the user hits 'open frontend' (or the same path on opencode/vscode),
launcher always calls codex.UpdateConfig to refresh model_provider /
base_url / bearer_token. UpdateConfig writes atomically: write tmp,
rename onto config.toml. If Codex Desktop is already running, it holds a
read handle on config.toml; Windows then rejects the rename with
ERROR_ACCESS_DENIED, and the launcher returned 500 to the UI even though
the launch itself would have succeeded.

Two-layer fix:

1. codex/config.go: writeConfigFile now retries rename with short
   backoffs (~250ms total). Rides out brief reader-hold races without
   any caller-side knowledge.
2. cmd/launcher/main.go: the three launch paths (Codex Desktop,
   OpenCode Desktop, VSCode) now log+continue on UpdateConfig failure
   instead of returning the error. The running Codex wouldn't reload
   the file anyway, so the rewrite is best-effort once Codex is up.

First-time launch (Codex not running, file unlocked) and
agentctl/agentserver set-model writes (user-initiated, Codex picks up on
next launch) are unaffected — they still see hard errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…picker

Codex Desktop's picker filters out anything outside its bundled OpenAI
catalog. openai/codex #15138, #19694, #22160, #29156 — all open since
2026/02. The official escape hatch is the (undocumented but live)
model_catalog_json field, which fully replaces the bundled catalog
with whatever JSON file it points at.

We now emit that file on every Codex launch:

- internal/codexdesktop/catalog.go (+model_template.json embed)
  Renders agentserver_model_catalog.json by cloning the bundled
  gpt-5.5 model row (extracted from openai/codex@main models.json)
  and overriding only slug+display_name per protoconv.Catalog() entry.
  Cloning the full row ensures every required ModelInfo field stays
  present so Codex's strict deserialize succeeds. Atomic write.
- internal/codex Settings: new ModelCatalogJSON field; UpdateConfig
  writes it as top-level model_catalog_json. Empty leaves existing
  value untouched.
- internal/paths: CodexModelCatalogFile = ~/.codex/agentserver_model_catalog.json
- cmd/launcher: launchCompletedCodexDesktop writes the catalog then
  points config at it. Both steps tolerate failure (logged) — picker
  loses the named affordance but model selection from our own console
  still works and Codex falls back to its bundled catalog.
- Dashboard subtitle nudges users that Codex's own picker now works too.

Refresh model_template.json when Codex adds required ModelInfo fields
across releases (otherwise catalog load fails and picker silently
falls back).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yzs15

yzs15 commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

Update — verified end-to-end on box 9.0.16.110

Final state after on-box testing + several iterations of bug fixes:

Verified working

model non-stream stream Codex Desktop UI
gpt-5.5 ✅ pass-through
deepseek-v4-pro ✅ Responses↔Chat ✅ full SSE w/ deltas
glm-5.2 ✅ Responses↔Anthropic ✅ full SSE w/ deltas

Switching: agentctl set-model <name> (CLI) or the new selector card in the 星池指挥官 dashboard. New conversations pick up the change; old threads stay on the model they were created with (Codex Desktop persists per-thread model in state_5.sqlite).

Bugs fixed in this round

  • fix(protoconv): handle bare-string input in Anthropic request mapper — gateway returned 1214 messages 参数非法 when Codex sent input as a string instead of an array.
  • fix(codex): launcher's provider-only update must preserve user-selected model — launcher rewrote model = "gpt-5.5" on every Codex Desktop start, undoing user's set-model choice.
  • fix(protoconv): carry upstream id+model into streaming response.created/completed — Codex parser fails with missing field id without these.
  • fix(protoconv): emit ResponseItem-shaped output_item events for Codex — message items needed role + content, function_call needed arguments. Without these, Codex's if let Ok(item) = ... silently dropped every item — stream looked complete, UI got no text.
  • fix(protoconv): merge developer role into system for Chat Completions — DeepSeek rejected the developer role; merge into a single leading system message (mirrors the Anthropic mapper).
  • fix(protoconv): drain Chat SSE body after [DONE] so upstream sees clean EOF — leaving trailing bytes unread caused http.Client to RST the TCP connection; the upstream then never marked the turn complete.
  • fix(launcher): tolerate locked config.toml when launching Codex Desktop — Codex Desktop holds config.toml open while running; os.Rename returned ERROR_ACCESS_DENIED and the launch path 500'd. UpdateConfig now retries rename, and the launcher logs+continues on failure (the running Codex wouldn't reload the file anyway).
  • feat(console-ui): model selector in 星池指挥官 dashboard — Vue 3 + Element Plus card under the connection grid, fed by protoconv.Catalog(). Codex Desktop's own picker is broken for custom providers (openai/codex #15138 etc., all still open), so we expose model switching in our own surface.

What we didn't ship

Tried to make Codex Desktop's own picker show GLM/DeepSeek via model_catalog_json (commit c20df28, reverted in 1187521). Verified directly: the catalog file was loaded correctly and the gateway's /v1/models returned all 31 models, but Desktop's frontend has a hardcoded slug allow-list that filters everything to OpenAI-prefixed names regardless. This is the same family of bugs as openai/codex #15138 / #19694 / #29156 — it can't be fixed from the config side. Reverted to avoid feeding incorrect metadata (cloned from gpt-5.5's row) to Codex's core, which could trigger wrong reasoning-level / context-window decisions.

Known limitation (not ours)

The modelserver gateway dashboard shows DeepSeek conversations as permanently processing with 0 bytes and a - ID. Verified the daemon receives the full SSE stream and forwards it correctly to Codex; gpt-5.5 (openai_responses) and glm-5.2 (anthropic_messages) rows both transition to success with full stats. This is a missing statistics hook on the openai_chat_completions provider type in the gateway itself — to be reported separately to the modelserver team.

P1 (real, blocking for direct-config users): SetModel previously routed
through UpdateConfig with a partial Settings, which deleted env_key when
Settings.EnvKey == "". A valid direct-provider config (env_key =
"OPENAI_API_KEY", no bearer token) silently became a proxy config (no
env_key, legacy bearer token), breaking auth on the next Codex start.
Rewrite SetModel as a self-contained one-field rewriter that touches
only the top-level 'model' key, leaving every provider field untouched.
First-call-on-missing-file still seeds the proxy defaults so headless
agentserver/agentctl set-model on a fresh install keeps working.
Regression covers both direct-config preservation and fresh-install seeding.

P2 (real, latent): chat.go / anthropic.go forwarded root["stream"] as-is,
yielding "stream": null on the wire when Codex omitted the field. Codex
Desktop always sets it so the live path didn't fire, but other Codex
clients (exec, review, app-server probes) may not. Forward only when
present as a bool.

P3 (real): execVSCode tolerated codex.UpdateConfig failures with a
log+continue, copy-pasted from the Codex Desktop / OpenCode launch
paths. The rationale there is that the Desktop app holds config.toml
open and the rename loses the race; VS Code does NOT hold the file
open, so the failure is real and must propagate (the Codex extension
reads config.toml at session creation).

P4 (real, correctness): Chat converter dropped parallel_tool_calls,
tool_choice, and function tools' strict flag. Codex 0.142 sends
parallel_tool_calls per model_info.supports_parallel_tool_calls and
may set tool_choice; dropping either lets the upstream silently
disregard client intent. Forward both, and preserve strict in the
tool list. Anthropic converter does NOT forward these on purpose:
parallel_tool_calls is not part of the Anthropic Messages API, and
tool_choice has a different object shape than Codex's string —
explicit comments document the omissions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yzs15

yzs15 commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed the four review findings in d60aa06:

# Status Fix
P1 SetModel corrupts direct configs ✅ confirmed real SetModel rewritten as a one-field rewriter; no longer routes through UpdateConfig. Direct provider configs (env_key-based) survive intact; missing-file path still seeds proxy defaults so agentctl set-model on a fresh headless install still works. Two regression tests cover both axes.
P2 "stream": null for converted requests ✅ confirmed real (latent for Codex Desktop but trips other Codex clients) Forward stream only when present as a bool, on both Chat and Anthropic sides. Regression test on both.
P3 VSCode launch hides config write failures ✅ confirmed real execVSCode propagates UpdateConfig errors. The locked-file rationale only applies to Codex/OpenCode Desktop paths where the target app holds config.toml open; VS Code does not, so the failure must surface (the Codex extension reads the config at session creation).
P4 dropped parallel_tool_calls / tool_choice / strict ✅ confirmed real (correctness) Chat: forward parallel_tool_calls + tool_choice when source sets them; preserve strict per-tool. Anthropic: deliberately not forwarded (parallel_tool_calls is not part of Anthropic Messages; tool_choice has a different object shape than Codex's string and there's no safe pass-through without a shape mapping). Both omissions are documented inline. Two regression tests on the Chat side.

All four bugs verified by reading the actual code paths, not taken on faith. Tests + full suite green.

…ew P5)

Verified the report: chat.go was forwarding root["reasoning"] verbatim
when present (even if null). The reviewer is correct that this is the
same family of bug as the stream-null fix, but it has a second
correctness layer beyond null-handling:

  reasoning is a Responses-API field. Chat Completions has no top-level
  reasoning. DeepSeek uses thinking: {type, reasoning_effort}; Anthropic
  uses thinking: {type, budget_tokens}; OpenAI Chat does not support
  reasoning at all. So even when Codex sends a non-null reasoning
  object, forwarding it raw produces a meaningless (or rejected) field.

Drop the forward entirely. Per-upstream mapping (Responses reasoning ->
DeepSeek thinking / Anthropic thinking) is a deferred feature; doing
nothing is strictly safer than forwarding the wrong-shape field.

The Anthropic converter never forwarded the top-level reasoning object
(it only handled inline 'reasoning' input items, which it drops), so
no change there.

Regression: three cases — reasoning as object, reasoning as null,
reasoning absent — all must produce a Chat body without a reasoning
key.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yzs15

yzs15 commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

Fixed in 78d42d4. Confirmed real (verified against DeepSeek docs: top-level reasoning is not a valid Chat request field; DeepSeek uses thinking: {type, reasoning_effort}). Per-upstream mapping is a deferred feature; for now the Chat converter drops reasoning entirely. Anthropic converter was never affected (it doesn't forward the top-level field). Regression test covers reasoning-as-object, reasoning-as-null, and reasoning-absent — all must produce a Chat body without a reasoning key.

@yzs15 yzs15 merged commit 30c6985 into master Jun 23, 2026
4 checks passed
yzs15 pushed a commit that referenced this pull request Jun 23, 2026
After PR #12, the local proxy is no longer just a token-refresh forwarder
— it does per-model protocol conversion (gpt-5.5 passthrough,
deepseek-v4-pro -> Chat Completions, glm-5.2 -> Anthropic Messages) and
ships a model-picker affordance in the console UI plus
agentctl/agentserver set-model. Update both user-facing docs so the
positioning matches the actual product:

- README.md: rename the bullet to '共享模型访问路径 + 多模型路由', list
  the per-model routing and the two ways to switch.
- 项目描述.md: replace the 'forwards to .../v1' note with a routing
  table and document the switch UX plus the new-conversation-only
  semantics (old threads keep their creation-time model — Codex Desktop
  persists model per thread in state_5.sqlite).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant