feat(agents): framework generic server — 5-interface standard for agents (#1101) #1403

Merged: kovtcharov-amd merged 1 commit into main from claudia/task-f631a112 on Jun 4, 2026
@kovtcharov-amd

Collaborator

Why this matters

Previously, an agent could be reached only through the interfaces it hand-wrote support for: every package that wanted a REST or MCP surface reimplemented the server glue, and most agents simply never got one. This adds a framework-provided generic server that exposes any Agent instance through the same five interface modes (the gaia-bash PR #985 pattern) without the agent implementing any server logic itself: interactive TUI, one-shot CLI (--prompt), pipe (stdin→stdout), OpenAI-compatible REST API (--api --port), and MCP stdio server (--mcp). An agent package wires one line, return run_agent_cli(MyAgent()), and gets all five.
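As a rough illustration of the flag dispatch described above, the sketch below maps the flags named in this PR (--prompt, --pipe, --api, --port, --mcp) to one of five mode names. It is not the code in server.py: `select_mode`, the mode strings, the default port, and the precedence between flags are all assumptions made for the example.

```python
import argparse
from typing import List, Optional


def select_mode(argv: Optional[List[str]] = None) -> str:
    """Map command-line flags to an interface name (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="my-agent")
    parser.add_argument("--prompt", help="one-shot CLI: answer one prompt and exit")
    parser.add_argument("--pipe", action="store_true", help="read stdin, write stdout")
    parser.add_argument("--api", action="store_true", help="serve the REST API")
    # The default port is an assumption for this sketch, not taken from the PR.
    parser.add_argument("--port", type=int, default=8000, help="port for --api")
    parser.add_argument("--mcp", action="store_true", help="serve MCP over stdio")
    args = parser.parse_args(argv)

    # Assumed precedence: server modes first, then one-shot modes, then TUI.
    if args.api:
        return "api"
    if args.mcp:
        return "mcp"
    if args.pipe:
        return "pipe"
    if args.prompt is not None:
        return "cli"
    return "tui"  # no flags: fall back to the interactive session
```

With no flags the helper falls through to the interactive mode, which mirrors the description's framing of TUI as the default surface.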

AgentServer reuses the existing REST schemas (gaia.api.schemas) and MCP tool conventions instead of duplicating them. When a parsed gaia-agent.yaml is supplied, its interfaces block gates the allowed modes: requesting a disabled interface fails loudly with an actionable error (per the fail-loudly rule) rather than degrading silently.

This is additive — a new src/gaia/agents/base/server.py plus tests, no existing behaviour changed. It does not depend on the #1102 agent-hub restructure; the gaia agent run CLI wiring lands with that work via entry-point discovery.

Implements the generic-server scope of #1101 and Key Decision #5 of docs/spec/agent-hub-restructure.mdx.

Test plan

  • PYTHONPATH=src python -m pytest tests/unit/test_agent_server.py -xvs → 27 passed
    • REST: serves a mock agent (/v1/chat/completions non-streaming + SSE, /v1/models, /v1/tools, /health), 404 on wrong model, 400 on missing user message
    • MCP stdio: initialize, tools/list, tools/call (incl. isError), prompts/list from manifest, unknown-method -32601, notification → no response, malformed-JSON -32700, full stdin→stdout loop
    • Pipe: stdin→stdout; empty stdin fails loudly
    • CLI + run_agent_cli dispatch; manifest-disabled interface → actionable error + exit 1
    • LLM is mocked — no live Lemonade server required
  • python util/lint.py --pylint --black --isort --fix → black & isort PASS; pylint clean on changed files (the one reported error is pre-existing os.geteuid debt in lemonade_installer.py, untouched here)
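The MCP stdio cases in the test plan (unknown method, notification, malformed JSON) follow standard JSON-RPC 2.0 behaviour. The sketch below shows one way a line-oriented handler could produce those outcomes; `handle_line` and the `methods` table are hypothetical names for this example and are not the module's `handle_mcp_request`. The error codes are the ones defined by JSON-RPC 2.0.

```python
import json
from typing import Any, Callable, Dict, Optional

PARSE_ERROR = -32700       # JSON-RPC 2.0: invalid JSON was received
INVALID_REQUEST = -32600   # JSON-RPC 2.0: payload is not a valid request object
METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0: method does not exist


def _error(req_id: Any, code: int, message: str) -> str:
    body = {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_line(line: str, methods: Dict[str, Callable[[dict], Any]]) -> Optional[str]:
    """Handle one stdin line; return the response line, or None for a notification."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return _error(None, PARSE_ERROR, "Parse error")
    if not isinstance(request, dict):
        return _error(None, INVALID_REQUEST, "Invalid Request")

    # A request without an "id" is a notification and never gets a reply.
    is_notification = "id" not in request
    handler = methods.get(request.get("method"))
    if handler is None:
        if is_notification:
            return None
        return _error(request["id"], METHOD_NOT_FOUND, "Method not found")

    result = handler(request.get("params") or {})
    if is_notification:
        return None
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})
```

A full stdin→stdout loop would call this once per input line and write only the non-None results.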

Closes part of #1101.

…nts (#1101)

Agents previously had to hand-roll their own server glue to be reachable as
anything other than an interactive session — every package reimplemented REST
and MCP wiring, and most simply never got an API or MCP surface at all. This
adds a framework-provided generic server so ANY Agent instance is exposed
through the same five interface modes (the gaia-bash PR #985 pattern) without
implementing server logic itself: interactive TUI, one-shot CLI (--prompt),
pipe (stdin→stdout), OpenAI-compatible REST API (--api --port), and MCP stdio
server (--mcp).

AgentServer wraps any Agent and reuses the existing REST schemas
(gaia.api.schemas) and MCP tool conventions rather than duplicating them.
run_agent_cli(agent, argv) is the single entry point a package wires to its
console script; it dispatches on the flags above. When a parsed gaia-agent.yaml
is supplied, its `interfaces` block gates the allowed modes — requesting a
disabled interface fails loudly with an actionable error instead of degrading
silently.
@github-actions github-actions bot added the tests and agents labels on Jun 4, 2026
@github-actions

github-actions Bot commented Jun 4, 2026

Contributor

Code Review — framework generic server (5-interface standard)

Approve with suggestions. This is a clean, well-tested additive module — a single src/gaia/agents/base/server.py plus a thorough unit suite, no existing behaviour touched. I reproduced the test run locally (27 passed) and confirmed black/isort are clean on both files. It correctly reuses gaia.api.schemas and the MCPAgent conventions rather than duplicating them, and the interface-gating matches the manifest's VALID_INTERFACES/Interfaces dataclass exactly. The one thing worth the author's attention: the non-streaming REST path runs a blocking process_query directly inside an async def handler — fine functionally, but it serializes concurrent requests.

Issues

🟢 Non-streaming /v1/chat/completions blocks the event loop (server.py:462-489). The handler is async def but calls self._run_query() → agent.process_query() synchronously, so a long generation stalls every other request (including /health) until it finishes. The streaming path sidesteps this (Starlette iterates the sync _sse_stream generator in a threadpool), which makes the non-streaming path the odd one out. Worth noting this matches the existing shared server (src/gaia/api/openai_server.py:279 does the same), so it's not a regression; but since this module is meant to be the standard every agent inherits, the cleanest fix is to drop async so FastAPI threadpools it, or offload explicitly:

        from starlette.concurrency import run_in_threadpool

        content = await run_in_threadpool(self._run_query, user_message)
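To make the stall concrete, here is a stdlib-only sketch of the same effect. It uses `asyncio.to_thread` (Python 3.9+) as a stand-in for Starlette's `run_in_threadpool`, and `slow_query`, both handlers, and `health_latency` are hypothetical names for this illustration, not code from server.py.

```python
import asyncio
import time


def slow_query(seconds: float) -> str:
    """Stand-in for a blocking agent call."""
    time.sleep(seconds)
    return "done"


async def blocking_handler(seconds: float) -> str:
    # Calls the blocking function directly: the event loop is stuck until it returns.
    return slow_query(seconds)


async def offloaded_handler(seconds: float) -> str:
    # Runs the blocking function in a worker thread, so the loop stays free.
    return await asyncio.to_thread(slow_query, seconds)


async def health_latency(handler, seconds: float) -> float:
    """Start a slow request, then time how long another coroutine waits for the loop."""
    task = asyncio.create_task(handler(seconds))
    start = time.perf_counter()
    await asyncio.sleep(0)  # yield once, as a concurrent /health request would
    waited = time.perf_counter() - start
    await task
    return waited


if __name__ == "__main__":
    print("blocking :", asyncio.run(health_latency(blocking_handler, 0.2)))
    print("offloaded:", asyncio.run(health_latency(offloaded_handler, 0.2)))
```

With the blocking handler, the second coroutine waits roughly as long as the query itself. With the offloaded handler it resumes almost immediately.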

🟢 A manifest without an interfaces block disables every mode, including TUI (server.py:133-152). Interfaces defaults all five flags to False, and _ensure_interface only short-circuits when manifest is None or interfaces is None — never for the all-False default. So run_agent_cli(agent, [], manifest=m) where m omits interfaces fails even the default TUI with InterfaceNotSupportedError. That's defensible fail-loudly behaviour, but it's a sharp edge: an author must explicitly enable each mode they want. A one-line mention in the AgentServer/run_agent_cli docstring ("every interface a package serves must be listed in interfaces:, including tui") would save a confusing debugging session.
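The sharp edge above is easier to see in miniature. The sketch below uses stand-in classes rather than the real manifest types: the field names other than `tui`, the `Manifest` wrapper, and the error wording are assumptions, as is the idea that the parser yields an all-default `Interfaces()` when the block is omitted.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Interfaces:
    # Every mode defaults to off, as the review describes.
    tui: bool = False
    cli: bool = False
    pipe: bool = False
    api: bool = False
    mcp: bool = False


@dataclass
class Manifest:
    interfaces: Optional[Interfaces] = None


class InterfaceNotSupportedError(RuntimeError):
    """Raised when a requested interface is not enabled in the manifest."""


def ensure_interface(name: str, manifest: Optional[Manifest]) -> None:
    # No manifest, or no interfaces object at all: gating is skipped.
    if manifest is None or manifest.interfaces is None:
        return
    if getattr(manifest.interfaces, name):
        return
    enabled = [f.name for f in fields(manifest.interfaces)
               if getattr(manifest.interfaces, f.name)]
    raise InterfaceNotSupportedError(
        f"Interface '{name}' was requested but the manifest enables "
        f"{enabled or 'none'}; set 'interfaces.{name}: true' in gaia-agent.yaml."
    )
```

Under these assumptions, `ensure_interface("tui", Manifest(interfaces=Interfaces()))` raises even though the author never disabled anything explicitly. That is the case a docstring note would head off.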

🟢 run_api double-checks the interface gate (server.py:540 and 393). run_api calls _ensure_interface(INTERFACE_API) and then build_api_app() calls it again. Harmless, just redundant — you can drop the check in run_api and let build_api_app own it.

🟢 _extract_content evaluates the inner .get unconditionally (server.py:230-232). result.get("result", result.get("response", result)) computes the "response" lookup even when "result" is present. Trivial; an if "result" in result ladder or result.get("result") or result.get("response") reads clearer, but not worth churn on its own.
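One caveat on the two suggested rewrites: they are not interchangeable. The sketch below compares the quoted expression with both alternatives on plain dicts. The function names are hypothetical, and the real `_extract_content` may do more than this one expression.

```python
from typing import Any


def extract_nested_get(result: dict) -> Any:
    # Form quoted in the review: the inner .get runs even when "result" exists.
    return result.get("result", result.get("response", result))


def extract_ladder(result: dict) -> Any:
    # Key-presence ladder: same return values, no wasted lookup.
    if "result" in result:
        return result["result"]
    if "response" in result:
        return result["response"]
    return result


def extract_or_chain(result: dict) -> Any:
    # `or` chain: shorter, but it skips falsy values and drops the final fallback.
    return result.get("result") or result.get("response")


if __name__ == "__main__":
    empty = {"result": "", "response": "fallback"}
    print(repr(extract_nested_get(empty)))  # ''
    print(repr(extract_ladder(empty)))      # ''
    print(repr(extract_or_chain(empty)))    # 'fallback'
```

The ladder preserves the original semantics exactly. The `or` chain returns the "response" value when "result" is an empty string, and returns None instead of the whole dict when neither key is present, so it would be a behaviour change rather than a pure readability fix.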

Strengths

  • Genuinely reuses the framework instead of forking it — REST responses are the real gaia.api.schemas models, MCP tool/prompt definitions defer to MCPAgent when present and synthesize a usable schema from the tool registry otherwise, so plain agents get a working surface for free.
  • Error handling is on-spec for the fail-loudly rule: InterfaceNotSupportedError names what was requested, what the manifest declares, and exactly which key to flip; the broad except in handle_mcp_request is a legitimate JSON-RPC boundary (logs with context, returns a structured -32603) and is correctly annotated.
  • Test suite earns its keep — covers REST (non-stream + SSE, 404/400 paths), MCP (initialize/tools/list/tools/call incl. isError, notifications, malformed JSON, full stdin→stdout loop), pipe, CLI dispatch, and the manifest-gating failure, all without a live Lemonade server via the FakeAgent that bypasses the heavy __init__.

Notes (non-blocking)

  • The module has no callers yet — by design, per the PR description (the gaia agent run wiring lands with Agent Hub: Restructure — move production agents to hub/agents/ #1102). No docs are required until that user-facing surface exists, but a follow-up to docs/reference/cli.mdx / an SDK page should ship alongside it.
  • Consider a small test for --mcp dispatch through run_agent_cli (currently only --prompt/--pipe/--api are exercised) and for run_tui, to close the matrix.

Verdict

Approve with suggestions — no blocking issues. All four findings are 🟢 minor; the event-loop one is the only one with runtime impact and it merely matches the existing server's behaviour. Safe to merge; applying the suggestions (especially the threadpool offload, given this is framework-level code) would make it the better template for everything that inherits it.

@kovtcharov-amd kovtcharov-amd merged commit d12332a into main Jun 4, 2026
39 checks passed
@kovtcharov-amd kovtcharov-amd deleted the claudia/task-f631a112 branch June 4, 2026 04:19