Skip to content

feat(agent): route the agent through a Cloudflare AI Gateway on Kimi#270

Open
saada wants to merge 1 commit into
masterfrom
cloudflare-ai-gateway
Open

feat(agent): route the agent through a Cloudflare AI Gateway on Kimi#270
saada wants to merge 1 commit into
masterfrom
cloudflare-ai-gateway

Conversation

@saada

@saada saada commented Jun 9, 2026

Copy link
Copy Markdown
Member

What

The agent command now answers with a fast Kimi model (kimi-k2.5),
reached over a Cloudflare AI Gateway instead of the headless Google Gemini
CLI. The gateway is a single place to observe and cost-control every model call,
and lets us swap upstream providers by config.

Why

We wanted the agent off Gemini and onto a fast Kimi model, with all of its
traffic flowing through one gateway for visibility and spend control — without
standing up per-provider plumbing.

Changes

  • util.ai-gateway (new): an OpenAI-compatible client for the gateway's
    …/compat/chat/completions endpoint. Configurable custom-provider route
    (Moonshot by default), Bearer auth for the upstream key, and an optional
    cf-aig-authorization header for an Authenticated Gateway.
  • commands.agent: rewritten as a chat responder over the gateway. Drops the
    headless gemini CLI, the generated yetibot-tool.py tool bridge, the scratch
    workdir + sweep, and the GitHub App token minting that existed only to
    authenticate the CLI's git pushes.
  • Agent config moves to [:agent] (:model defaults to kimi-k2.5); gateway
    credentials live under [:cloudflare :ai-gateway]. The old
    [:gemini :agent :model] is still honored as a fallback.
  • The agent keeps drawing from the shared monthly budget; its per-run weight
    drops to a Kimi-sized estimate ($0.05) so the $5 pool isn't spent in a
    handful of runs.

Behavior change

The agent no longer opens pull requests or runs gh/git — a chat
completion has no autonomous tool-use loop. It now answers questions, writes
code, reviews, and debugs in-thread.

Image (banana) and video (veo) generation are unchanged and still call
Google Gemini directly.

Config to set before use

:agent {:model "kimi-k2.5"}
:cloudflare {:ai-gateway {:account-id ""
                          :gateway-id ""
                          :provider "custom-moonshot"
                          :api-key ""}}  ; Moonshot key

Create an AI Gateway with a Moonshot custom provider (base URL
https://api.moonshot.ai/v1); the provider route prefix is custom-<name>.

Testing

lein test for the touched namespaces is green: util.ai-gateway,
commands.agent, util.gemini, plus commands.banana/commands.veo (shared
budget ledger). A live !agent smoke test needs a provisioned gateway + key.

Swap the agent's Google Gemini CLI backend for a direct OpenAI-compatible
chat call routed through a Cloudflare AI Gateway, defaulting to the fast
kimi-k2.5 model. The gateway is one place to observe and cost-control every
model call.

- add util.ai-gateway: an OpenAI-compatible client for the gateway's compat
  endpoint, with a configurable custom-provider route (Moonshot by default)
  and an optional Authenticated Gateway header
- rewrite the agent as a chat responder: drop the headless gemini CLI, the
  generated yetibot-tool.py tool bridge, the scratch workdir, and the GitHub
  App token minting that existed only to authenticate the CLI's git pushes.
  The agent no longer opens PRs on its own.
- agent config moves to [:agent] (model defaults to kimi-k2.5); gateway
  credentials live under [:cloudflare :ai-gateway]
- the agent keeps drawing from the shared monthly budget; its per-run weight
  drops to a Kimi-sized estimate ($0.05) so the $5 pool isn't spent in a
  handful of runs

Image and video generation are unchanged and still call Google directly.
@saada saada requested a review from devth as a code owner June 9, 2026 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant