Skip to content

feat(agents): add Algolia Agent Studio command tree (full API surface)#212

Open
cmarguta-alg wants to merge 29 commits intoalgolia:mainfrom
cmarguta-alg:lab_week_3
Open

feat(agents): add Algolia Agent Studio command tree (full API surface)#212
cmarguta-alg wants to merge 29 commits intoalgolia:mainfrom
cmarguta-alg:lab_week_3

Conversation

@cmarguta-alg
Copy link
Copy Markdown

@cmarguta-alg cmarguta-alg commented May 6, 2026

🤔 What

Adds algolia agents backed by api/agentstudio: agent lifecycle (list, get, create, update, delete, publish, unpublish, duplicate), completions (run on a persisted id, try on a local --config file hitting /1/agents/test/completions), plus conversations, domains, providers, keys, feedback, user-data (alias userdata), cache, config, and hidden internal. Destructive or sensitive paths require --confirm / -y on non-TTY, same pattern as elsewhere in the CLI.

🤷 Why

Agent Studio workflows are cumbersome to reproduce with one-off HTTP calls or dashboard-only flows; exposing them in the Algolia CLI makes automation, CI, and support playbooks repeatable.

🛠 How

Area Change
api/agentstudio Typed client modules per domain; shared transport; sentinel-style error envelopes; SSE parsing aligned with AI SDK v4/v5 streaming.
pkg/cmd/agents/* One package area per backend concern; default masking on provider/key secrets (--show-secret to print raw values); guardrails spelled out in docs/agents.md (conversation purge bounds, / in user tokens, memory route prefix).
Factory / config / build AgentStudioClient on cmdutil.Factory; agent_studio_url plus ALGOLIA_AGENT_STUDIO_URL (see docs/agents.md / root agents --help for precedence).
E2E e2e/testscripts/agents/*.txtar for offline validation paths; gated live agents list when ALGOLIA_AGENT_STUDIO_E2E=1.
Full algolia agents leaf commands — Cobra UseLine + command-scoped flags
Every command also inherits the standard algolia connection flags (--application-id, --api-key for the Algolia Search credential, --profile, --search-hosts). Those globals are omitted from each listing unless this command adds a different --api-key (the provider shortcuts on agents providers create|update).

### algolia agents cache invalidate <agent-id> [--before YYYY-MM-DD] [--confirm] [flags]
- --before, --confirm (-y)

### algolia agents config get [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents config set [--retention-days N] [-F file] [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --retention-days, --template

### algolia agents conversations delete <agent-id> <conversation-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents conversations export <agent-id> [--start-date YYYY-MM-DD] [--end-date YYYY-MM-DD] [--output-file path] [flags]
- --allow-missing-template-keys, --end-date, --output-file (-O), --output (-o), --start-date, --template

### algolia agents conversations get <agent-id> <conversation-id> [flags]
- --allow-missing-template-keys, --include-feedback, --output (-o), --template

### algolia agents conversations list <agent-id> [flags]
- --allow-missing-template-keys, --end-date, --feedback-vote, --include-feedback, --output (-o), --page, --per-page, --start-date, --template

### algolia agents conversations purge <agent-id> (--start-date YYYY-MM-DD | --end-date YYYY-MM-DD) [--confirm] [flags]
- --confirm (-y), --end-date, --start-date

### algolia agents create -F <file> [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --template

### algolia agents delete <agent-id> [--confirm] [flags]
- --allow-missing-template-keys, --confirm (-y), --output (-o), --template

### algolia agents domains bulk-delete <agent-id> [--domain-id x ...] [-F file] [--confirm] [flags]
- --confirm (-y), --domain-id, --file (-F)

### algolia agents domains bulk-insert <agent-id> [--domain x ...] [-F file] [flags]
- --allow-missing-template-keys, --domain, --file (-F), --output (-o), --template

### algolia agents domains create <agent-id> --domain <pattern> [flags]
- --allow-missing-template-keys, --domain, --output (-o), --template

### algolia agents domains delete <agent-id> <domain-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents domains get <agent-id> <domain-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents domains list <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents duplicate <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents feedback create --agent-id <id> --message-id <id> --vote 0|1 [--tags x,y] [--notes ...] [flags]
- --agent-id, --allow-missing-template-keys, --message-id, --notes, --output (-o), --tags, --template, --vote

### algolia agents get <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents keys create --name <name> [--agent-id <id> ...] [flags]
- --agent-id, --allow-missing-template-keys, --name, --output (-o), --show-secret, --template

### algolia agents keys delete <id> [--confirm] [flags]
- --confirm (-y)

### algolia agents keys get <id> [flags]
- --allow-missing-template-keys, --output (-o), --show-secret, --template

### algolia agents keys list [flags]
- --allow-missing-template-keys, --limit, --output (-o), --page, --show-secret, --template

### algolia agents keys update <id> [--name <name>] [--agent-id <id> ...] [flags]
- --agent-id, --allow-missing-template-keys, --name, --output (-o), --show-secret, --template

### algolia agents list [flags]
- --allow-missing-template-keys, --output (-o), --page, --per-page, --provider-id, --template

### algolia agents providers create (-F <file> | --name <name> --provider <type> (--api-key <key> | --api-key-stdin | --api-key-env <var>)) [flags]
- --allow-missing-template-keys, --api-key-env, --api-key-stdin, --api-key, --base-url, --file (-F), --name, --output (-o), --provider, --show-secret, --template

### algolia agents providers defaults [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents providers delete <provider-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents providers get <provider-id> [flags]
- --allow-missing-template-keys, --output (-o), --show-secret, --template

### algolia agents providers list [flags]
- --allow-missing-template-keys, --output (-o), --page, --per-page, --show-secret, --template

### algolia agents providers models [--provider-id <id>] [flags]
- --allow-missing-template-keys, --output (-o), --provider-id, --template

### algolia agents providers update <provider-id> (-F <file> | [--name <name>] [--api-key <key> | --api-key-stdin | --api-key-env <var>] [--base-url <url>]) [flags]
- --allow-missing-template-keys, --api-key-env, --api-key-stdin, --api-key, --base-url, --file (-F), --name, --output (-o), --show-secret, --template

### algolia agents publish <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents run <agent-id> [--input <file> | --message <text>] [flags]
- --compatibility, --input (-i), --message (-m), --ndjson, --no-analytics, --no-cache, --no-memory, --no-stream, --secure-user-token

### algolia agents try --config <file> [--input <file> | --message <text>] [flags]
- --compatibility, --config (-c), --input (-i), --message (-m), --ndjson, --no-analytics, --no-cache, --no-memory, --no-stream, --secure-user-token

### algolia agents unpublish <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents update <agent-id> -F <file> [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --template

### algolia agents user-data delete <user-token> [--confirm] [flags]
- --confirm (-y)

### algolia agents user-data get <user-token> [flags]
- --output-file (-o)

### Hidden agents internal (not included in describe agents; unstable)

The internal group is omitted from algolia describe agents because the parent command is Hidden.

#### algolia agents internal status [flags]
- Structured output bundle: --output (-o), --template, --allow-missing-template-keys (same pattern as other read verbs emitting JSON).

#### algolia agents internal memorize <agent-id> [--body <json> | -F file] [flags]
#### algolia agents internal ponder <agent-id> [--body <json> | -F file] [flags]
#### algolia agents internal consolidate <agent-id> [--body <json> | -F file] [flags]
- --body, --file (-F), plus the structured output bundle above.

🧪 Testing

Quick checks

task test
task lint
task build

Packages

go test ./api/agentstudio/... -count=1
go test ./pkg/cmd/agents/... -count=1

E2E (TestAgents, offline scripts; synthetic app credentials suffice)

ALGOLIA_APPLICATION_ID=test ALGOLIA_API_KEY=test \
  go test ./e2e -tags=e2e -run '^TestAgents$' -count=1

E2E (add live agents list envelope check)

ALGOLIA_APPLICATION_ID=<app-id> ALGOLIA_API_KEY=<api-key> ALGOLIA_AGENT_STUDIO_E2E=1 \
  go test ./e2e -tags=e2e -run '^TestAgents$' -count=1

Action / expectation

  • When algolia agents conversations purge "<agent-id>" -y runs in a non-interactive shell without --start-date or --end-date, then the process exits non-zero before a purge request leaves the CLI (bounded delete is mandatory).
  • When algolia agents user-data get 'ab/cd' (token contains /), then the CLI exits non-zero with the gateway-routing advisory (avoiding a bogus 404 from premature decoding).
Optional staging / manual probes (substitute real ids and JSON paths)
./algolia agents list --output json
./algolia agents create -F agent.json
./algolia agents try -c draft.json -m "hi" --compatibility v5 --no-stream
./algolia agents run <agent-id> -m "Hello" --no-stream
./algolia agents conversations purge <agent-id> --start-date 1970-01-01 -y
./algolia agents domains bulk-insert <agent-id> --domain a.example.com --domain b.example.com
./algolia agents keys list
./algolia agents keys list --show-secret --output json
./algolia agents internal status

⚠️ Dependencies / risk

  • Reuses profile Application ID + Search API key; mutating agents keys verbs require an admin key (HTTP 403 is mapped deterministically).
  • ALGOLIA_AGENT_STUDIO_URL is optional—when unset the CLI follows the docs/agents.md resolver (profile region → baked default URL → proxy fallback).

cmarguta-alg and others added 3 commits May 6, 2026 15:23
Introduces a stand-alone, hand-written client in api/agentstudio/ that
later cmd/agent-studio/* commands will sit on top of.

- host.go: resolve base URL from --agent-studio-url override, region
  (eu/us, staging only in eu), or {appID}.algolia.net/agent-studio
  cluster-proxy fallback.
- types.go: Go structs for Agent and PaginatedAgentsResponse, matching
  the camelCase wire format produced by the conversational-ai backend.
- errors.go: structured APIError + sentinels (ErrUnauthorized,
  ErrForbidden, ErrFeatureDisabled, ErrNotFound, ErrServer) so callers
  can branch without parsing strings.
- client.go: ListAgents end-to-end, with X-Algolia-Application-Id /
  X-Algolia-API-Key / X-Algolia-User-ID headers and JSON-error parsing.

Tests cover region+env+override+fallback resolution, request/response
round-trips, pagination params, context cancellation, and every
documented HTTP error mapping. Verified live against a beta app on the
deployed conversational-ai backend; no factory wiring yet.

A TODO marks future replacement with a generated client once the
backend publishes one from specs/agent-studio/spec.yml.

Co-authored-by: Cursor <cursoragent@cursor.com>
Wires a new agentstudio client into the factory and exposes the first
two read-only commands. `list` supports --page, --per-page,
--provider-id, and --output (table on TTY, JSON for piping).
`get <id>` defaults to JSON.

Host resolution uses, in order:
  1. ALGOLIA_AGENT_STUDIO_URL env var, or the profile's
     "agent_studio_url" field
  2. Build-time agentstudio.DefaultBaseURL (ldflag-injected from
     .env's ALGOLIA_AGENT_STUDIO_URL — internal beta builds set
     this to https://agent-studio.staging.eu.algolia.com)
  3. Cluster-proxy fallback https://<appID>.algolia.net/agent-studio
     (recommended for production end-users — the application's own
     cluster routes the request to the right region, per the
     conversational-ai README)

Verified live against the beta backend (staging EU) using only `task
build` defaults; fixes a regression where 422 validation errors were
surfaced with the generic "Input is invalid, see detail/body:" prefix
instead of the structured FastAPI detail.msg — extractDetail now
prefers structured detail[] over the generic message field.

Co-authored-by: Cursor <cursoragent@cursor.com>
Six new mutating commands under `algolia agents`, completing parity
with the conversational-ai backend's `/v1/agents/*` surface.

Body for `create`/`update` is passed as opaque `json.RawMessage`. The
AgentConfigCreate schema is large, deeply validated, and evolves
frequently — the CLI is a pass-through; the backend validates; our
422-detail surfacing (Phase 2 fix) makes errors actionable. Mirroring
the schema in Go would lie about parity and force a CLI release on
every backend change.

`--dry-run` shows the resolved request body in human mode (more
useful than the existing `objects/update` count-only summary), and
emits a structured `{action, request, source, bytes, body, dryRun}`
envelope only when `--output` is *explicitly* set — `WithDefaultOutput
("json")` from the success path no longer steals the human dry-run
path silently.

`delete` follows `objects/delete` exactly for the confirmation
contract: `-y/--confirm` skips the prompt; non-TTY without `--confirm`
refuses with `cmdutil.FlagErrorf`; `--dry-run` GETs the agent and
prints "would DELETE" without calling DELETE. Pre-fetching lets us
show name + status both at the prompt and in dry-run output.

DELETE handles 204 No Content correctly (no body decode); the other
five mutating endpoints share `doAgentMutation` (POST/PATCH + JSON
Agent response).

Live-verified end-to-end against staging EU: 13/13 checks pass —
real create, get, update (rename), duplicate, delete; publish
correctly surfaces backend `409 Cannot publish an agent without a
model or provider` for a model-less smoke agent.

Co-authored-by: Cursor <cursoragent@cursor.com>
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 6, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 1390 complexity · 859 duplication

Metric Results
Complexity 1390
Duplication 859

View in Codacy

TIP This summary will be updated as you push new changes.

cmarguta-alg and others added 4 commits May 6, 2026 22:02
Two new commands close the developer loop on the agents tree:
`agents test` for iterating on an in-memory agent configuration
without persisting it (backend special-cases agent_id="test"), and
`agents run <id>` for invoking a persisted agent — the same call
shape downstream apps make in production.

The wire format is not standard SSE. Recon caught this before code
was written: v5 IS standard SSE (`data: <json>\n\n` with a `[DONE]`
sentinel), but v4 is a Vercel-AI bespoke protocol — `<type>:<json>\n`
per line, no terminator, with 16 single-character type codes mapped
to names (`0`=text, `9`=tool-call, `d`=finish-message, …). The
`compatibilityMode` query parameter is mandatory server-side with
no default; the CLI defaults to v5 (standard, easier to parse
defensively) and exposes `--compatibility v4|v5`. The new
`api/agentstudio/sse.go` parser sniffs the line prefix and handles
both formats transparently — same approach the backend's own
acceptance test helpers use.

`client.Completions(...)` returns the raw `*http.Response` so the
caller decides streaming vs buffered based on `Content-Type`. One
method instead of two avoids duplicating request building for one
endpoint. The error path closes the body explicitly (the existing
`checkResponse` only drains a 64 KiB prefix).

Streaming output is NDJSON regardless of TTY — one parsed event per
line as `{"type":"...","data":{...}}`. Predictable; pipes cleanly
into `jq -r 'select(.type=="text-delta") | .data.delta'` for the
"just show me the text" use case. Rich TTY rendering (incremental
token print + dim tool-calls) is deferred — not worth growing this
PR another ~150 LOC for a UX nicety.

`--no-stream` returns the buffered JSON response verbatim.
`--dry-run` prints the resolved body and skips the HTTP call.
`--message` is a one-shot convenience that wraps as
`[{"role":"user","content":"<text>"}]`; `--input` reads a messages
JSON array from a file or `-` (stdin); cobra's
`MarkFlagsMutuallyExclusive` enforces exactly one.

`signal.NotifyContext(SIGINT)` is wired locally inside `test/run`,
not in `root.go`. Streaming commands are the only ones that need
mid-call cancellation; touching the global root context risked
silently regressing every other command's signal behavior.

Backend's discriminated `messages` union is pre-validated client-
side with a cheap `[]any` shape check so the obvious "passed an
object instead of an array" case names the file in the error
instead of producing a 422.

Live-verified end-to-end against staging EU: 11/11 checks pass.
The test app has no LLM provider configured, so a real completion
can't run — but the 422s we get back ("Field required" from the
test endpoint, "Agent is not published" from `agents run` against
the existing draft agent) confirm URL routing, query params,
headers, body framing, and 422-detail extraction all land where
intended. The streaming parser path is unit-tested against a
synthetic SSE source (8 cases covering both wire formats, malformed
lines, large frames, [DONE], onEvent error propagation).

Other pieces:
- `NormalizeCompatibility` lives in `pkg/cmd/agents/shared/`
  (extracted on second use, same rule as `printDryRun` in Phase 3).
- `BuildMessages` / `ReadJSONFile` / `MarshalCompletionBody` /
  `RenderCompletion` shared between the two commands; all unit-
  tested in isolation.

Co-authored-by: Cursor <cursoragent@cursor.com>
…idation order

Three pieces that round out the agents tree for review.

AGENTS.md grew an "Agent Studio" section so the next contributor
doesn't replay the recon: architecture, auth model, base-URL
resolution priority chain, SSE wire format gotchas (v4 IS NOT
standard SSE; `compatibilityMode` is mandatory; CLI defaults to
v5), dry-run convention (`cmd.Flags().Changed("output")` gating,
not `PrintFlags.HasStructuredOutput()`), and a brief on why no
bespoke telemetry events were added.

Two new e2e testscripts:

- testscripts/agents/dry-run.txtar (ungated): the contract test
  that the dry-run paths still produce stable output through the
  full cobra tree under the e2e binary. Exercises create/update
  structured + human modes, test with --input/--message, run with
  the agent id in the path, and the flag-validation guards.
  Covers the create/update/test/run dry-run paths in one file
  (delete dry-run is exercised in unit tests; it doesn't fit the
  same shape because it pre-fetches the agent and so still needs
  network).

- testscripts/agents/list.txtar (env-gated on
  ALGOLIA_AGENT_STUDIO_E2E=1): read-only smoke confirming the
  live backend still emits the wire format we parse against.
  Gated because the standard e2e test app doesn't necessarily
  have the gen_ai feature flag enabled — running this against
  the wrong app would 403 and fail the suite.

To make the env-gating possible, registered an `envCondition`
in e2e_test.go that exposes `[env:VAR]` and `[!env:VAR]` to all
testscripts. The cli/go-internal testscript framework only ships
GOOS/GOARCH conditions out of the box; this is a one-liner that
treats "" and "0" as falsy (matches how callers actually set
ALGOLIA_AGENT_STUDIO_E2E=1).

Behavior fix discovered while writing dry-run.txtar:
`--compatibility` was validated AFTER the dry-run short-circuit
in `agents test` and `agents run`, so `--compatibility v9
--dry-run` silently produced a "valid-looking" preview for an
invalid request. Validation now runs first; same rationale as
rejecting --input + --message together at flag-parse time. The
unit test that asserted "invalid value rejected" had been
running in non-dry-run mode and thus already covered the new
behavior; no test churn needed.

Telemetry deliberately NOT added. The existing `Command Invoked`
event from `root.PersistentPreRunE` carries
`{command: cmd.CommandPath(), flags: [<changed flag names>]}`
which already attributes per verb (`algolia agents create`) and
per dry-run (flags contains "dry-run"). Adding bespoke
`agents.<verb>.{start,success,error,dry_run}` events would be
the first in-command telemetry calls in the entire codebase,
diverging from convention for one feature only. Outcome
(success/error) is a separate, all-commands refactor.

Docs generator (`go run ./cmd/docs --app_data-path <dir>`)
verified to serialize all 10 agents commands cleanly with the
expected frontmatter / examples / usage strings. Output is
consumed downstream by the docs site and is intentionally not
committed.

Live-CRUD testscript (create→get→update→delete chained on the
same id) deferred — it needs an id-extraction helper in the
testscript framework that we don't have yet. The shell smokes in
tmp/qa/ already exercise that path against staging.

Co-authored-by: Cursor <cursoragent@cursor.com>
User caught a real terminological collision in code review.

The brief said "tools to test and debug their Agent Studio setups
with --dry-run". I had read that as a literal flag spec on every
relevant command. Cross-checking the backend disproved that read:

  - Strings `dry_run`/`dryRun`/`dry-run` in
    `algolia/conversational-ai@development`: zero hits.
  - The closest backend primitive is `AgentTestConfiguration` +
    the `agent_id="test"` route — a real completion against an
    in-memory configuration, nothing persisted.

That route is what `agents test` (now renamed `agents try`) wraps.
The whole command IS the conversational-ai-side "dry-run" — running
unsaved config against the real backend. So `agents test --dry-run`
read literally as "dry-run a dry-run", which is the awkward case
the user pointed at.

Resolution:

1. Rename `agents test` → `agents try`. "Try" is single-word
   (preserves the CLI-wide convention — zero hyphenated subcommands
   exist anywhere in the tree), reads as "experiment without
   commitment", and avoids the pytest/unit-test connotations of
   "test" that don't apply here. `try` is registered in
   `agents.go` via the `trycmd` import alias to avoid clashing
   with Go's `try` keyword candidate / common variable name.

2. Drop the `--dry-run` flag (and its branch in `runTryCmd`) from
   the renamed command. There's nothing useful to "preview without
   HTTP" — the command is already side-effect-free. Dropped the
   corresponding unit test and replaced it with a comment block
   explaining why the test no longer exists.

3. Keep `--dry-run` on `create / update / delete / run`. That
   flag is the standard CLI preview convention (matches
   `objects/update --dry-run`); it lives on commands that DO have
   side effects (`create`/`update`/`delete` mutate state, `run`
   invokes a persisted agent which logs conversations + analytics
   + tokens). No collision with the conversational-ai concept on
   any of those.

4. Updated `e2e/testscripts/agents/dry-run.txtar`: removed the
   `agents test --dry-run` cases, added a regression assertion
   that `agents try --dry-run` is rejected as `unknown flag` so
   a future contributor doesn't re-add it without seeing this
   rationale. New `try.txtar` covers the four flag-validation
   guards on `agents try` (missing --config, missing --input/-m,
   --input/-m together, invalid --compatibility) — none of which
   need network, all fire before client construction.

5. AGENTS.md grew an "On `--dry-run`" section spelling out the
   two distinct concepts and why they must stay separate. Future
   contributors see this before touching the agents tree.

What I got wrong originally: I read the brief's "--dry-run"
literally instead of as shorthand for "the test/debug capability
the spec describes". Once `agents try` exists, the brief's intent
is satisfied by the command itself; layering a CLI-convention
`--dry-run` on top of it was duplicating the deliverable. The
unit tests I'd written for `agents test --dry-run` had been
passing because the dry-run branch was correctly implemented —
not because it was the right thing to implement. Pattern noted:
when a feature already satisfies an intent, double-check before
adding a flag that "completes" it.

Net diff: +28 / -378. The body-preview semantics are still
available everywhere they make sense; only the redundant
collision case is gone.

Co-authored-by: Cursor <cursoragent@cursor.com>
Closes the four completion query-param gaps and the cache-invalidate
endpoint flagged by the parity audit. Smallest delta of the planned
phases — chosen to land first because it validates the per-phase-PR
rhythm without introducing brand-new architectural surface.

What's in:

  - `agents try` and `agents run` learn `--no-cache`, `--no-memory`,
    `--no-analytics`, and `--secure-user-token <jwt>`. Maps onto the
    backend's documented completion query params and the
    X-Algolia-Secure-User-Token header.
  - New `agents cache invalidate <agent-id> [--before YYYY-MM-DD]`
    wraps `DELETE /1/agents/{id}/cache`. Mirrors `agents delete`'s
    confirmation contract: TTY prompts, non-TTY refuses without
    `--confirm`, `--dry-run` bypasses both since it's non-destructive.

Polarity calls (worth knowing for future contributors):

  - The CompletionOptions.No* fields are inverted from the wire on
    purpose. Backend defaults all three to true; only the negative is
    interesting to expose at the CLI surface, and `memory=true` would
    actually 422 (the schema is `anyOf [{const false}, {type null}]`).
    So the wire form omits the param when the No* field is false and
    sends `<param>=false` when true. Polarity is pinned end-to-end by
    a table-driven wire-mapping test in completions_test.go and
    smaller cmd-level guards in try/run.
  - Date validation on `--before` is deliberately the backend's job.
    Mirroring Pydantic's date parser in Go would create silent skew
    on minor backend bumps; the 422-detail surfacing already turns
    bad input into an actionable message verbatim.

Architecture: introduces the first nested cobra group under `agents`
(`agents cache <verb>`). Phase 6+ reuse this for providers /
conversations / keys / domains. Per-verb sub-package wasn't worth it
for one verb; if `cache stats` etc. land later, split then.

Refactor on second use deferred deliberately: 4 new flags duplicate
mechanically across try.go and run.go (8 lines per command). Could
be extracted into a `RegisterCompletionFlags` helper, but try and run
are exactly two consumers and the duplication is straight-line. If a
third consumer appears, extract following the rule we've used since
Phase 3 (`PrintDryRun`, `NormalizeCompatibility`).

Coverage:

  - `api/agentstudio/`: 1 new client method (`InvalidateAgentCache`),
    extended `CompletionOptions`, table-driven wire-mapping test
    covering all four flag combinations + the secure-user-token
    header, four new cache-endpoint tests (no-before / with-before /
    404→ErrNotFound / structured-422-detail surfaces verbatim).
  - `pkg/cmd/agents/cache/`: new package, 6 unit tests covering happy
    path, before-flag, dry-run with and without `--before`, missing
    agent-id, non-TTY-without-confirm, and 404 propagation.
  - `pkg/cmd/agents/try/` and `run/`: one targeted end-to-end test
    each, asserting cobra→opts→client wiring is intact for all four
    new flags (catches polarity transposition, missed forwarding).
  - e2e: new `cache.txtar` (5 contract checks); `dry-run.txtar`
    extended with a regression that the new `--no-*` flags are
    wire-only and don't leak into the dry-run body preview.

Live verification deferred. Staging test app still has no LLM
provider configured, so a green completion isn't possible — but
cache invalidation doesn't depend on a provider, and the unit + e2e
coverage is exhaustive for the wire shapes. A future Phase 6 commit
that lands provider CRUD will revive the live-completion smoke.

Strategy: per-phase PR from here on. This branch
(`lab_week_3_phase5`) is cut from `lab_week_3` HEAD; the PR will
target `cmarguta-alg:lab_week_3` so the diff is bounded to Phase 5
only. When PR algolia#212 merges upstream, this PR's base re-targets to
`algolia:main`.

Net diff: +390 / −21 across 10 files, 2 new files (cache package,
cache.txtar). 13/13 agents packages green, all 4 e2e testscripts
pass (try / cache / dry-run / list-skipped-as-gated), gofumpt
clean, golangci-lint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
cmarguta-alg and others added 9 commits May 6, 2026 23:52
Adds two new nested command groups under `algolia agents`, lifting the
CLI from 9/41 to 19/41 documented Agent Studio endpoints (~46%):

  agents providers <list|get|create|update|delete|models>
  agents config    <get|set>

API client (api/agentstudio/):
- providers.go: ListProviders/GetProvider/CreateProvider/UpdateProvider/
  DeleteProvider + ListProviderModels (catalog) + ListModelsForProvider
  (configured-provider). Provider.Input is json.RawMessage to mirror
  the 6-way discriminated input union (openai/azure_openai/google_genai/
  deepseek/openai_compatible/anthropic) — same rationale as Agent.Config.
- configuration.go: GetConfiguration/UpdateConfiguration. Body kept as
  json.RawMessage for forward-compat even though today's schema is a
  single int (maxRetentionDays).

CLI (pkg/cmd/agents/{providers,config}/):
- providers.go: 6 verbs in one file extending the cache-group precedent
  (rather than per-verb subpackages). `models` is one command with two
  routes — no flag hits the static catalog, --provider-id hits the
  configured-provider listing.
- mask.go: redacts `apiKey` to "***" by default in providers list/get/
  create/update success output. --show-secret opts out for scripted
  exports. Sets the convention for Phase 8 (`agents keys`).
- config.go: `set` accepts either `--retention-days N` or `-F file.json`
  (mutually exclusive). File path is forward-compat for future fields.

Docs + tests:
- AGENTS.md: updated sub-group listing, provider client conventions,
  and a new "Provider secret masking" section explaining the redaction
  contract.
- 27 client tests + 16 cmd tests + 2 e2e testscripts (providers.txtar,
  config.txtar) covering flag validation, dry-run output shape, masking
  on/off, and route picking.
- Live-verified against staging EU: config get returns the live
  retention default, providers list returns proper paginated shape on
  empty, models catalog returns ~50 entries across 4+ provider types.

Co-authored-by: Cursor <cursoragent@cursor.com>
Refactor only — no behaviour changes. Both the API client layer and the
providers command package now mirror the OpenAPI spec's tag structure
(one source file per tag), which makes it easier to add a new endpoint
or verb later: a single new file plus its sibling test file.

api/agentstudio/
  client.go      568 LOC monolith → 193 LOC infra-only
                 (Config/Client/NewClient/setHeaders/checkResponse/
                  extractDetail/sentinelFor)
  agents.go      NEW — Agents tag (List/Get/Create/Update/Delete/
                 Publish/Unpublish/Duplicate/InvalidateAgentCache +
                 doAgentMutation/doAgentLifecycle helpers)
  completions.go NEW — Completions tag (Completions + boolToWire)

  agents_test.go (renamed from mutations_test.go) absorbs GetAgent_*
                 from get_agent_test.go (deleted) and the ListAgents
                 happy-path tests previously in client_test.go.
  client_test.go now holds only the cross-cutting infra tests
                 (NewClient validation, checkResponse error mapping,
                 ctx cancellation, header omission). The error-mapping
                 suite still uses ListAgents as a vehicle since it's
                 the simplest GET shape; the file header explains why.

pkg/cmd/agents/providers/
  providers.go   738 LOC → 93 LOC (parent + readBody/ctxOrBackground/
                 relTimeOrDash helpers)
  list.go, get.go, create.go, update.go, delete.go, models.go
                 NEW — one file per verb, same package so MaskInput
                 and the helpers stay un-exported

  providers_test.go shrinks to the test harness (newClientForServer +
                 writeTempJSON); each verb gets a sibling _test.go
                 with its own scenarios.

AGENTS.md gains a "File organisation" section documenting both layers'
tag-aligned conventions and the explicit decision NOT to promote
sub-group verbs to their own subpackages.

All gates green: gofumpt, go vet, go test ./..., e2e TestAgents/{cache,
try,dry-run,config,providers}, task build, golangci-lint on touched
dirs.

Co-authored-by: Cursor <cursoragent@cursor.com>
Adds 5 endpoints under `agents conversations <verb>`: list, get, delete,
purge, export. Brings CLI parity to 24/41 documented Agent Studio
endpoints (~59%, ~71% of publicly visible).

Client (api/agentstudio/conversations.go):
  - ListConversations  → typed PaginatedConversationsResponse
  - GetConversation    → json.RawMessage (full shape has discriminated
                         message-role union; passthrough is honest)
  - DeleteConversation → 204 sentinel-mapped
  - PurgeConversations → 204 (DELETE on the list endpoint)
  - ExportConversations → json.RawMessage (spec leaves shape unspecified)

  ListConversationsParams.FeedbackVote is *int because nil = no filter
  while 0 (downvote) is a meaningful value. Regression-tested.

Commands (pkg/cmd/agents/conversations/, per-verb files following the
providers/ pattern from the previous refactor):
  - list:   table view + --output json passthrough; supports
            --start-date/--end-date/--include-feedback/--feedback-vote.
            --feedback-vote requires --include-feedback (backend
            constraint, rejected at CLI for a clearer error).
  - get:    raw JSON passthrough (pretty-printed by default,
            structured via --output json), --include-feedback toggle.
  - delete: single-record; mirrors `agents delete` confirm contract
            (--confirm required non-TTY, --dry-run previews).
  - purge:  bulk; two extra guardrails on top of the confirm contract:
            (a) dateless purge requires explicit --all (otherwise a
                typo on --start-date would silently turn a range purge
                into a wipe-all);
            (b) --all is mutually exclusive with --start-date/--end-date.
  - export: --output-file writes raw bytes (jq-friendly), stdout
            pretty-prints; --start-date/--end-date scope.

  All verbs take agent-id as the first positional argument, matching
  `agents publish/run/cache invalidate`.

Tests:
  - 14 client unit tests (incl. *int FeedbackVote zero-value regression,
    feature-disabled propagation, ID validation, dateless-purge wire
    shape).
  - 16 cmd-level tests (incl. dateless-purge guardrail rejection,
    --all + date mutex rejection, feedback-vote-without-include
    rejection, dry-run URL+scope shapes).
  - e2e/testscripts/agents/conversations.txtar (no registration needed
    — TestAgents glob-runs everything in testscripts/agents/).

AGENTS.md gains a "Conversations: purge vs delete" section documenting
both guardrails and why they live at the cmd layer (the wire client
mirrors the spec).

Gates green: gofumpt, go vet, go test ./... agent dirs (16/16 packages),
e2e TestAgents/{cache,try,dry-run,config,providers,conversations},
golangci-lint, task build.

Co-authored-by: Cursor <cursoragent@cursor.com>
Spec-vs-reality mismatch caught by Anya's Phase 7 live vet against
staging EU.

OpenAPI declares both `startDate` and `endDate` on
`DELETE /1/agents/{id}/conversations` as `required: false`, which
reads as "dateless purge wipes everything." The live backend
disagrees: dateless DELETE returns
`400 "At least one filter is required"`.

The previously-shipped `--all` flag was therefore non-functional —
every invocation produced a request the backend always rejected.

Changes:
  - Remove `--all` (and the dead `PurgeOptions.All` field).
  - Require at least one of `--start-date`/`--end-date` at the CLI;
    error message names the backend constraint so future spec drift
    stays debuggable.
  - Document the open-ended-bound workaround (--start-date 1970-01-01)
    in help text for genuine wipe-all needs. Verified live (exit 0).
  - Tests realigned: drop the obsolete --all/dateless-without-all
    cases, add Test_runPurgeCmd_RefusesDateless and
    Test_runPurgeCmd_AcceptsOpenEndedStart.
  - e2e/testscripts/agents/conversations.txtar updated to match.
  - AGENTS.md gains an explicit "Spec vs reality on purge" callout
    with the failing wire trace from the vet.

Lesson noted in tmp/lab_week_3_plan.md and tmp/qa/anya_phase7_vet/
REPORT.md: trust live curl over the FastAPI OpenAPI spec, which
reflects the function signature rather than the validation logic.

Gates green post-fix: gofumpt, go vet, conversations unit tests,
TestAgents/conversations e2e, golangci-lint on agents tree,
task build.

Co-authored-by: Cursor <cursoragent@cursor.com>
Adds 6 endpoints under `agents domains <verb>`: list, get, create,
delete, bulk-insert, bulk-delete. Per-agent CORS allowlist; agent ID
positional first arg as in conversations/cache.

Client (api/agentstudio/domains.go):
  - Typed AllowedDomain + AllowedDomainListResponse (no opaque JSON,
    no pagination — list returns {domains:[]} flat).
  - Bulk insert wants {domains:[]string}, bulk delete wants
    {domainIds:[]string} as a request body (DELETE with body).

Commands (pkg/cmd/agents/domains/):
  - list, get, create (--domain), delete (-y/--dry-run),
    bulk-insert (--domain repeated or -F file), bulk-delete
    (--domain-id repeated or -F file). Bulk verbs accept a JSON array
    of strings via -F for scripted use; --domain* and -F are mutually
    exclusive.
  - Live-probed against staging EU before coding; wire shapes match
    the spec exactly.

Gates: gofumpt, vet, unit, e2e (TestAgents/domains), lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
… 8b)

5 endpoints under `agents keys <verb>`: list, get, create, update,
delete. Per-app scope, optional per-agent allowlist via agentIds.

Mutating verbs require an admin API key. Live-probed against staging
EU: GET /1/secret-keys returns 200 with empty list, all mutating
verbs return 403 {"message":"Admin API key required."} with a
non-admin key (extractDetail already maps {message} envelopes — gives
ErrForbidden + clear detail). Documented in command long-help.

MaskInput moved from pkg/cmd/agents/providers/ to pkg/cmd/agents/shared/
since two consumers now redact secrets (provider input.apiKey,
secret-key Value). Adds MaskString for top-level scalar secrets.

Update semantics:
  - --name and --agent-id are independently set; nil pointer = leave
    field unchanged in the PATCH body (matches OpenAPI's anyOf-null).
  - Repeated --agent-id replaces (not merges) the allowlist.
  - --agent-id="" clears the allowlist (empty strings filtered out;
    if all values are empty the resulting [] still gets sent through).

Output safety:
  - SecretKey.Value is redacted as "***" by default in list/get/
    create/update output.
  - --show-secret opts in to raw value (e.g. for piping into a vault).

Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/keys),
lint, build.

Co-authored-by: Cursor <cursoragent@cursor.com>
Two single-tag sub-groups:

  agents feedback create     POST /1/feedback
  agents user-data get       GET    /1/user-data/{user_token}
  agents user-data delete    DELETE /1/user-data/{user_token}

Live-probed against staging EU. Wire shapes match the spec; the
feedback endpoint returns 404 with {message:...} envelope for unknown
message IDs (already mapped via extractDetail).

Notes:
  - feedback validates locally: vote ∈ {0,1}, tags ≤10 each ≤50
    chars, notes ≤1000 chars (matches OpenAPI bounds; saves a
    round-trip on common mistakes).
  - user-data is the GDPR right-to-access (get) and right-to-be-
    forgotten (delete) endpoints; delete is irreversible and erases
    across every agent in the app, so the prompt is explicit and
    --confirm is enforced in non-interactive shells.
  - GetUserData returns the full envelope with []json.RawMessage
    inner items (conversations have a discriminated message-role
    union, memories have an unspecified shape — same convention as
    GetConversation in Phase 7).

Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/
feedback + TestAgents/userdata), lint, build.

Co-authored-by: Cursor <cursoragent@cursor.com>
Closes 100% of the Agent Studio OpenAPI surface (41/41 endpoints).

New top-level group (parent hidden, children visible):
  agents internal status         GET  /status
  agents internal memorize       POST /1/agents/agents/{id}/memorize
  agents internal ponder         POST /1/agents/agents/{id}/ponder
  agents internal consolidate    POST /1/agents/agents/{id}/consolidate

New providers verb:
  agents providers defaults      GET  /1/providers/models/defaults

Path quirk verified live: memorize/ponder/consolidate live under the
doubled prefix /1/agents/agents/ (the spec is correct, not a typo).
The single-prefix equivalents 404 with FastAPI's standard {detail:
"Not Found"} envelope; the doubled prefix routes to the actual
handler. Pinned in the godoc on api/agentstudio/internal.go.

Hidden semantics: parent has Hidden:true so it doesn't pollute
`agents --help`, but children are visible — running
`agents internal --help` still surfaces what's available to anyone
who knows to look. Pinned by a unit test.

Memory verbs (memorize/ponder/consolidate) take the agent ID
positional and a JSON body via --body or -F ("-" for stdin),
mirroring the agents try input contract. Body shapes are evolving
(messages is a v4/v5 discriminated role union per the spec) so the
client passes the body through as-is and returns the response as
json.RawMessage — no premature typing.

defaults: small read returning {provider_type → recommended_model}.
Sibling of `agents providers models` — different question, different
shape, didn't fit as a flag on the existing models verb.

Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/*),
lint, build.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ya F-1)

Anya's Phase 8 vet caught an asymmetric 404 from the staging gateway:
`agents user-data get` and `delete` against any token containing
"/" return `agent studio: 404 Not Found: Not Found` while every
other unknown token (including ones with +, ?, #, &, =) returns a
clean 200 empty / 204.

Root cause is server-side — the gateway decodes %2F before routing
so the path no longer matches /1/user-data/{token}. The CLI does the
right thing with url.PathEscape; the symptom is a misleading 404
that an operator processing a GDPR deletion request would reasonably
read as "data was never stored."

Fix: refuse "/"-bearing tokens at the CLI layer with a precise
error message that names the gateway behavior, on both verbs. One
shared const, two two-line guards. No behavior change for tokens
without "/".

Live-verified: `./algolia agents user-data get "ab/cd"` now
prints the explicit message instead of "404 Not Found."

Gates: gofumpt, vet, unit (api+all agents pkgs), e2e
(TestAgents/userdata), lint, build.

Co-authored-by: Cursor <cursoragent@cursor.com>
@cmarguta-alg cmarguta-alg changed the title feat: add agents commands for Algolia Agent Studio feat(agents): add Algolia Agent Studio command tree (full API surface) May 7, 2026
cmarguta-alg and others added 6 commits May 7, 2026 13:03
Three small follow-ups to PR algolia#212.

* `agents try` / `agents run`: TTY-attached stdout now renders a
  human-readable transcript (text-deltas inline, tool calls/results
  as dim annotations, errors red) instead of NDJSON. Non-TTY paths
  keep emitting NDJSON unchanged so existing pipelines (jq, log
  capture, scripts) are not broken. New `--ndjson` flag forces NDJSON
  on a TTY for users who want machine output on screen.

* `agents domains bulk-delete`: now pre-fetches the existing list
  (TTY-only, one extra GET) so the success line can split requested
  IDs into removed-vs-already-absent. Non-TTY behavior unchanged.
  Backend returns 204 with no body, so this is the only signal we can
  give users about whether their delete actually did anything.

* `agents keys list`: lock the table-render mask path with two new
  tests (default masks, --show-secret reveals). Existing tests only
  covered --output json — the table writes k.Value verbatim, so the
  mask must apply before the row is added.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ents

Pulls the long-form Agent Studio rationale (file layout, auth, host
resolution, streaming protocols, dry-run semantics, secret masking,
backend gotchas) out of multi-paragraph in-source comments and
out of AGENTS.md, into a single reference at docs/agents.md.

Source files now keep one-line godoc comments only; gotchas that
genuinely have to live next to the code (purge requires a date,
slash in user-tokens, doubled memory-op path, memory polarity,
ACL on configuration) are kept as 1-2 line hints with a pointer
to docs/agents.md.

AGENTS.md drops the agent-studio sections and links to the doc.

Net: -945 / +219 lines across 25 files, no behavior change.
Docs generator + full test sweep + gofumpt all clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
Five low-risk DRY passes across `pkg/cmd/agents/`:

- `newClientForServer` × 14 test files → `sharedtest.NewClient`
  (new `pkg/cmd/agents/sharedtest/` package, kept out of production
  imports). `writeTempJSON` × 6 → `sharedtest.WriteTempJSON`.
- `ctxOrBackground` × 7 sub-group files → `shared.OrBackground` and
  inline `if ctx == nil { ctx = context.Background() }` blocks in
  delete/cache/try/run.
- `--confirm` / "non-TTY refuses without -y" pattern × 9 commands
  → `shared.AddConfirmFlag` + `shared.ResolveConfirm` +
  `shared.Confirm`.
- File-reading helpers (`readBody`, `readJSONBody`, inline
  `cmdutil.ReadFile + TrimUTF8BOM + json.Valid` in
  config/update/create/domains) → `shared.ReadJSONFile`.
- Completion-flag block (8 flags + the input/message mutex)
  duplicated in `try` and `run` → `shared.CompletionInputs` +
  `shared.RegisterCompletionFlags`.

Net: 102 files changed, +587 / -770. Behavior unchanged; full suite
green, golangci-lint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
Allow common provider flows without -F: --name/--provider plus
--api-key, --api-key-stdin, or --api-key-env for create (openai,
anthropic, google_genai, deepseek); patches via --name, API key flags,
and optional --base-url for update. Azure and openai_compatible still
require JSON files.

Keeps file and shortcut paths mutually exclusive. Updates agents e2e
providers.txtar for the relaxed -F requirement.

Co-authored-by: Cursor <cursoragent@cursor.com>
Adds docs/agents.md section on providers -F vs flags; links to
docs/qa/arg_friendly_providers_SIGNOFF.md (Lab Week planning-team
alignment + ordered test commands).

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
cmarguta-alg and others added 2 commits May 7, 2026 15:25
Co-authored-by: Cursor <cursoragent@cursor.com>
tmp is cleaned by Taskfile and may contain local QA Go packages outside
review; use issues.exclude-dirs instead of deprecated run.skip-dirs.

Co-authored-by: Cursor <cursoragent@cursor.com>
Reuse build ldflags (including Agent Studio URL from .env); output binary
algolia-beta and default VERSION main+beta. Document in AGENTS and README.

Co-authored-by: Cursor <cursoragent@cursor.com>
cmarguta-alg and others added 4 commits May 7, 2026 17:26
Inject version.Distribution via task build-beta ldflags (empty for releases).
PersistentPreRunE + help wrapper on agents group warns on stderr; defaults
stay on stdout pipelines. Docs + DIST template in Taskfile.

Co-authored-by: Cursor <cursoragent@cursor.com>
Remove docs/qa sign-off sheet; trim agents.md link. Strip tmp/qa,
named QA, and Phase N callouts from tests and e2e scripts while
keeping technical rationale.

Co-authored-by: Cursor <cursoragent@cursor.com>
Strip the experimental --dry-run flag and preview branches from every
agents subcommand. ResolveConfirm now only reflects TTY versus --confirm.
Remove shared PrintDryRun and fold body helpers down to SourceLabel and
TrimUTF8BOM.

Refresh docs/agents.md, AGENTS.md, and README wording for beta builds.
Rewrite e2e testscripts/agents cases and drop agents dry-run smoke.
Align try help/comments and unit tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
@cmarguta-alg cmarguta-alg marked this pull request as ready for review May 8, 2026 14:00
@NatanTechofNY
Copy link
Copy Markdown
Collaborator

🤖 AI-Generated Code Review by Claude


PR #212 Code Review

Overview

Large new feature PR (+13,183 lines) adding the complete algolia agents command tree backed by a new api/agentstudio typed client.

Overall: Well-structured addition with good test coverage, sensible guardrails (secret masking, bounded purge, path-escaping), and clear documentation. A few security and reliability concerns in the transport layer and URL-construction code warrant attention before shipping.


🚨 HIGH ISSUES

1. --api-key Flag Persists LLM Provider Key to Shell History

  • File: pkg/cmd/agents/providers/create.go (+ update.go)
  • Issue: The --api-key <key> flag accepts a raw plaintext secret on the command line. Any value passed this way lands in shell history (~/.zsh_history, ~/.bash_history), is visible in /proc/<pid>/cmdline on Linux for the lifetime of the process, and appears in ps aux output.
  • Risk: Read access to the user's home directory or process table is sufficient to harvest the key. Safe alternatives (--api-key-stdin, --api-key-env) already exist in this PR.
  • Required Actions:
    • Consider dropping --api-key entirely and only keeping --api-key-stdin / --api-key-env (same decision made by many CLIs that handle credentials — e.g. gh, docker login).
    • At minimum, emit a deprecation/security warning to stderr whenever --api-key is used directly.

2. URL Override Accepts http:// — API Keys Sent Unencrypted, SSRF Risk

  • File: api/agentstudio/host.goResolveHost
  • Issue: The Override path returns the value verbatim after TrimRight. No scheme validation is performed. A misconfigured profile or injected env var can point ALGOLIA_AGENT_STUDIO_URL at http://attacker.example.com, causing every request (including those carrying X-Algolia-API-Key) to be sent unencrypted. Locally, http://169.254.169.254/ and similar IMDS endpoints are also reachable.
  • Risk: Credential exfiltration over plaintext HTTP; internal network probing (SSRF) on cloud VMs.
  • Required Actions:
    • Reject overrides that don't use https://. Return an error: if !strings.HasPrefix(host, "https://") { return "", fmt.Errorf("agent_studio_url must use https, got %q", host) }.
    • If http:// needs to stay for local dev, gate it behind an explicit --insecure flag with a loud stderr warning.

3. ApplicationID Embedded Unsanitised in Cluster-Proxy URL

  • File: api/agentstudio/host.go — cluster-proxy fallback
  • Issue: The fallback path is "https://" + appID + ".algolia.net/agent-studio". appID is only whitespace-trimmed — no format validation. A value containing # (e.g. evil.com#) makes Go's URL parser treat .algolia.net/agent-studio as a fragment, connecting to evil.com instead. Similarly a value with / or ? redirects the request while leaving .algolia.net in the path/query.
  • Risk: If ApplicationID can be set from an untrusted config source or env var injection, all API calls (including credentials) are silently sent to the injected host.
  • Required Actions:
    • Validate ApplicationID against the expected format before URL construction: regexp.MustCompile("^[A-Z0-9]{8,12}$").MatchString(appID).
    • Or construct the URL with url.URL{} fields so special characters are rejected/encoded and the resulting host is always <appID>.algolia.net.

4. Streaming Completions Have No Idle Read Deadline

  • File: api/agentstudio/completions.go + api/agentstudio/sse.go
  • Issue: Completions returns the raw *http.Response for the caller to stream. If the caller provides context.Background() or a context without a deadline, bufio.Scanner.Scan() inside ParseStream will block forever when the server opens a connection, sends a few events, then stalls (network partition, upstream crash, proxy mis-configuration). http.DefaultClient has no fallback timeout either.
  • Risk: CLI hangs indefinitely — only killable with SIGKILL. Particularly bad in unattended CI/CD pipelines.
  • Required Actions:
    • Document (in godoc + docs/agents.md) that callers must supply a context with a deadline when calling Completions.
    • Consider wrapping resp.Body in a deadline-rearming reader, or use a custom http.Transport with ResponseHeaderTimeout + IdleConnTimeout set.

⚡ PERFORMANCE / RELIABILITY CONCERNS

5. http.DefaultClient Used as Fallback — No Timeout, Shared Global State

  • File: api/agentstudio/client.goNewClient
  • Issue: if cfg.HTTPClient == nil { cfg.HTTPClient = http.DefaultClient }. http.DefaultClient.Timeout is zero (no timeout). It's also a package-level global — any other package that mutates it will silently affect agentstudio requests.
  • Risk: Non-streaming requests (list, get, create, etc.) can hang indefinitely on slow servers. Shared state creates fragile coupling.
  • Fix: Always construct a private *http.Client with an explicit Timeout (e.g. 30 s) for non-streaming calls. For streaming, use a custom http.Transport with dial/TLS/header timeouts only, leaving Client.Timeout at zero.

6. SSE Scanner Buffer Allows 5 MB Per Line; Response Body Ownership is Implicit

  • File: api/agentstudio/sse.go + api/agentstudio/completions.go
  • Issue: const maxLine = 1024 * 5120 (5 MB). A single large SSE frame causes the scanner's backing buffer to grow to 5 MB and stay there for the full stream duration (bufio.Scanner does not shrink). Combined with Completions returning a raw *http.Response, a caller that abandons streaming mid-way without calling Close() leaks the TCP connection until GC finalises the response.
  • Risk: Memory pressure on long streams with large tool-result blobs; connection pool exhaustion in loop scripts.
  • Fix: Lower maxLine to a realistic cap (e.g. 512 KB) if 5 MB payloads aren't expected. Add a clear godoc note on Completions that the caller must Close() the response body.

🛠 BEST-PRACTICES NOTES

7. Silent SSE Frame Skipping Produces No Observable Signal

  • File: api/agentstudio/sse.goParseStream
  • Issue: Unparseable or unknown-type frames are silently dropped (if !ok { continue }). The rationale ("backends evolve faster than parsers") is sound for unknown future event types, but makes it impossible to distinguish "stream contained 0 useful frames" from "every frame was malformed due to a backend regression."
  • Fix: Accept an optional onUnknown func(raw string) callback, or count skipped frames and surface the count via a verbose/debug flag. A --debug path writing skipped raw lines to stderr would catch silent protocol mismatches immediately.

8. Agent E2E Tests Gated Behind ALGOLIA_AGENT_STUDIO_E2E — Will Bit-Rot in CI

  • File: e2e/e2e_test.go + e2e/testscripts/agents/
  • Issue: The live-agent e2e tests require a dedicated ALGOLIA_AGENT_STUDIO_E2E=1 env var separate from the ALGOLIA_APPLICATION_ID gate used by every other e2e test in the repo. These tests are almost certainly not wired into any CI job and will accumulate drift against the real API.
  • Fix: Follow the existing e2e pattern — gate on ALGOLIA_APPLICATION_ID being set (or a single shared flag). Document in CONTRIBUTING how to run agent e2e tests and add a CI step or release checklist item.

9. User Tokens Containing / Should Be Hard-Rejected in DeleteUserData

  • File: api/agentstudio/userdata.go
  • Issue: The docs note that user tokens with / produce an advisory exit, but the guard is documentation-only — url.PathEscape encodes / as %2F, which some reverse proxies (nginx merge_slashes, AWS ALB) silently decode before routing, treating it as a path component. For a destructive endpoint, a token like ../agents/some-agent could resolve to an unintended resource under a permissive gateway.
  • Fix: Add a hard rejection in both GetUserData and DeleteUserData: if strings.Contains(userToken, "/") { return fmt.Errorf("user token must not contain '/'") }. Don't rely on documentation for a destructive path.

✅ STRENGTHS

  • Secret masking: Provider and key secrets masked by default with --show-secret opt-in — well-implemented and tested across both TTY and JSON output paths.
  • Path escaping: url.PathEscape consistently applied to all ID parameters with explicit tests for edge-case IDs.
  • Bounded purge: Date-range requirement on conversation purge caught at flag-parse time with a clear test.
  • SSE protocol normalisation: The v4/v5 compatibility shim is clean and well-tested with explicit regression cases for query-parameter polarity.
  • Error sentinel hierarchy: errors.Is(err, ErrNotFound) chains via APIError.Unwrap() work cleanly and are tested end-to-end.

📋 BEFORE MERGE

Must Have:

  1. Validate URL override scheme (https:// only) — host.go
  2. Validate/sanitise ApplicationID format before cluster-proxy URL construction — host.go

Should Have:
3. Remove or warn on --api-key plaintext flag — providers/create.go, providers/update.go
4. Add read deadline or document context requirement for streaming completions — completions.go
5. Hard-reject / in user tokens in DeleteUserDatauserdata.go
6. Align agent e2e tests with existing CI gating pattern — e2e/


🎯 RECOMMENDATION

Conditional Approval — Strong foundation with excellent test coverage and good UX guardrails. The URL-construction issues in host.go (items 2 & 3) are the only ones that represent a genuine pre-ship security risk given that ALGOLIA_AGENT_STUDIO_URL / agent_studio_url are user-configurable. Everything else is hardening and polish.

Scores:

  • Code Quality: 8/10
  • Security: 6/10
  • Performance: 7/10
  • Multi-Tenancy: 8/10

🤖 Generated by Claude — Automated review focusing on security, performance, and multi-tenancy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants