feat(agents): add Algolia Agent Studio command tree (full API surface) by cmarguta-alg · Pull Request #212 · algolia/cli

cmarguta-alg · 2026-05-06T18:44:23Z

🤔 What

Adds algolia agents backed by api/agentstudio: agent lifecycle (list, get, create, update, delete, publish, unpublish, duplicate), completions (run on a persisted id, try on a local --config file hitting /1/agents/test/completions), plus conversations, domains, providers, keys, feedback, user-data (alias userdata), cache, config, and hidden internal. Destructive or sensitive paths require --confirm / -y on non-TTY, same pattern as elsewhere in the CLI.

🤷 Why

Agent Studio workflows are cumbersome to reproduce with one-off HTTP calls or dashboard-only flows; exposing them in the Algolia CLI makes automation, CI, and support playbooks repeatable.

🛠 How

Area	Change
`api/agentstudio`	Typed client modules per domain; shared transport; sentinel-style error envelopes; SSE parsing aligned with AI SDK v4/v5 streaming.
*`pkg/cmd/agents/`**	One package area per backend concern; default masking on provider/key secrets (`--show-secret` to print raw values); guardrails spelled out in `docs/agents.md` (conversation purge bounds, `/` in user tokens, memory route prefix).
Factory / config / build	`AgentStudioClient` on `cmdutil.Factory`; `agent_studio_url` plus `ALGOLIA_AGENT_STUDIO_URL` (see `docs/agents.md` / root `agents --help` for precedence).
E2E	*`e2e/testscripts/agents/.txtar` for offline validation paths; gated live `agents list` when `ALGOLIA_AGENT_STUDIO_E2E=1`**.

Full algolia agents leaf commands — Cobra UseLine + command-scoped flags

Every command also inherits the standard algolia connection flags (--application-id, --api-key for the Algolia Search credential, --profile, --search-hosts). Those globals are omitted from each listing unless this command adds a different --api-key (the provider shortcuts on agents providers create|update).

### algolia agents cache invalidate <agent-id> [--before YYYY-MM-DD] [--confirm] [flags]
- --before, --confirm (-y)

### algolia agents config get [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents config set [--retention-days N] [-F file] [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --retention-days, --template

### algolia agents conversations delete <agent-id> <conversation-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents conversations export <agent-id> [--start-date YYYY-MM-DD] [--end-date YYYY-MM-DD] [--output-file path] [flags]
- --allow-missing-template-keys, --end-date, --output-file (-O), --output (-o), --start-date, --template

### algolia agents conversations get <agent-id> <conversation-id> [flags]
- --allow-missing-template-keys, --include-feedback, --output (-o), --template

### algolia agents conversations list <agent-id> [flags]
- --allow-missing-template-keys, --end-date, --feedback-vote, --include-feedback, --output (-o), --page, --per-page, --start-date, --template

### algolia agents conversations purge <agent-id> (--start-date YYYY-MM-DD | --end-date YYYY-MM-DD) [--confirm] [flags]
- --confirm (-y), --end-date, --start-date

### algolia agents create -F <file> [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --template

### algolia agents delete <agent-id> [--confirm] [flags]
- --allow-missing-template-keys, --confirm (-y), --output (-o), --template

### algolia agents domains bulk-delete <agent-id> [--domain-id x ...] [-F file] [--confirm] [flags]
- --confirm (-y), --domain-id, --file (-F)

### algolia agents domains bulk-insert <agent-id> [--domain x ...] [-F file] [flags]
- --allow-missing-template-keys, --domain, --file (-F), --output (-o), --template

### algolia agents domains create <agent-id> --domain <pattern> [flags]
- --allow-missing-template-keys, --domain, --output (-o), --template

### algolia agents domains delete <agent-id> <domain-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents domains get <agent-id> <domain-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents domains list <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents duplicate <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents feedback create --agent-id <id> --message-id <id> --vote 0|1 [--tags x,y] [--notes ...] [flags]
- --agent-id, --allow-missing-template-keys, --message-id, --notes, --output (-o), --tags, --template, --vote

### algolia agents get <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents keys create --name <name> [--agent-id <id> ...] [flags]
- --agent-id, --allow-missing-template-keys, --name, --output (-o), --show-secret, --template

### algolia agents keys delete <id> [--confirm] [flags]
- --confirm (-y)

### algolia agents keys get <id> [flags]
- --allow-missing-template-keys, --output (-o), --show-secret, --template

### algolia agents keys list [flags]
- --allow-missing-template-keys, --limit, --output (-o), --page, --show-secret, --template

### algolia agents keys update <id> [--name <name>] [--agent-id <id> ...] [flags]
- --agent-id, --allow-missing-template-keys, --name, --output (-o), --show-secret, --template

### algolia agents list [flags]
- --allow-missing-template-keys, --output (-o), --page, --per-page, --provider-id, --template

### algolia agents providers create (-F <file> | --name <name> --provider <type> (--api-key <key> | --api-key-stdin | --api-key-env <var>)) [flags]
- --allow-missing-template-keys, --api-key-env, --api-key-stdin, --api-key, --base-url, --file (-F), --name, --output (-o), --provider, --show-secret, --template

### algolia agents providers defaults [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents providers delete <provider-id> [--confirm] [flags]
- --confirm (-y)

### algolia agents providers get <provider-id> [flags]
- --allow-missing-template-keys, --output (-o), --show-secret, --template

### algolia agents providers list [flags]
- --allow-missing-template-keys, --output (-o), --page, --per-page, --show-secret, --template

### algolia agents providers models [--provider-id <id>] [flags]
- --allow-missing-template-keys, --output (-o), --provider-id, --template

### algolia agents providers update <provider-id> (-F <file> | [--name <name>] [--api-key <key> | --api-key-stdin | --api-key-env <var>] [--base-url <url>]) [flags]
- --allow-missing-template-keys, --api-key-env, --api-key-stdin, --api-key, --base-url, --file (-F), --name, --output (-o), --show-secret, --template

### algolia agents publish <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents run <agent-id> [--input <file> | --message <text>] [flags]
- --compatibility, --input (-i), --message (-m), --ndjson, --no-analytics, --no-cache, --no-memory, --no-stream, --secure-user-token

### algolia agents try --config <file> [--input <file> | --message <text>] [flags]
- --compatibility, --config (-c), --input (-i), --message (-m), --ndjson, --no-analytics, --no-cache, --no-memory, --no-stream, --secure-user-token

### algolia agents unpublish <agent-id> [flags]
- --allow-missing-template-keys, --output (-o), --template

### algolia agents update <agent-id> -F <file> [flags]
- --allow-missing-template-keys, --file (-F), --output (-o), --template

### algolia agents user-data delete <user-token> [--confirm] [flags]
- --confirm (-y)

### algolia agents user-data get <user-token> [flags]
- --output-file (-o)

### Hidden agents internal (not included in describe agents; unstable)

The internal group is omitted from algolia describe agents because the parent command is Hidden.

#### algolia agents internal status [flags]
- Structured output bundle: --output (-o), --template, --allow-missing-template-keys (same pattern as other read verbs emitting JSON).

#### algolia agents internal memorize <agent-id> [--body <json> | -F file] [flags]
#### algolia agents internal ponder <agent-id> [--body <json> | -F file] [flags]
#### algolia agents internal consolidate <agent-id> [--body <json> | -F file] [flags]
- --body, --file (-F), plus the structured output bundle above.

🧪 Testing

Quick checks

task test
task lint
task build

Packages

go test ./api/agentstudio/... -count=1
go test ./pkg/cmd/agents/... -count=1

E2E (TestAgents, offline scripts; synthetic app credentials suffice)

ALGOLIA_APPLICATION_ID=test ALGOLIA_API_KEY=test \
  go test ./e2e -tags=e2e -run '^TestAgents$' -count=1

E2E (add live agents list envelope check)

ALGOLIA_APPLICATION_ID=<app-id> ALGOLIA_API_KEY=<api-key> ALGOLIA_AGENT_STUDIO_E2E=1 \
  go test ./e2e -tags=e2e -run '^TestAgents$' -count=1

Action / expectation

When algolia agents conversations purge "<agent-id>" -y runs in a non-interactive shell without --start-date or --end-date, then the process exits non-zero before a purge request leaves the CLI (bounded delete is mandatory).
When algolia agents user-data get 'ab/cd' (token contains /), then the CLI exits non-zero with the gateway-routing advisory (avoiding a bogus 404 from premature decoding).

Optional staging / manual probes (substitute real ids and JSON paths)

./algolia agents list --output json
./algolia agents create -F agent.json
./algolia agents try -c draft.json -m "hi" --compatibility v5 --no-stream
./algolia agents run <agent-id> -m "Hello" --no-stream
./algolia agents conversations purge <agent-id> --start-date 1970-01-01 -y
./algolia agents domains bulk-insert <agent-id> --domain a.example.com --domain b.example.com
./algolia agents keys list
./algolia agents keys list --show-secret --output json
./algolia agents internal status

⚠️ Dependencies / risk

Reuses profile Application ID + Search API key; mutating agents keys verbs require an admin key (HTTP 403 is mapped deterministically).
ALGOLIA_AGENT_STUDIO_URL is optional—when unset the CLI follows the docs/agents.md resolver (profile region → baked default URL → proxy fallback).

Introduces a stand-alone, hand-written client in api/agentstudio/ that later cmd/agent-studio/* commands will sit on top of. - host.go: resolve base URL from --agent-studio-url override, region (eu/us, staging only in eu), or {appID}.algolia.net/agent-studio cluster-proxy fallback. - types.go: Go structs for Agent and PaginatedAgentsResponse, matching the camelCase wire format produced by the conversational-ai backend. - errors.go: structured APIError + sentinels (ErrUnauthorized, ErrForbidden, ErrFeatureDisabled, ErrNotFound, ErrServer) so callers can branch without parsing strings. - client.go: ListAgents end-to-end, with X-Algolia-Application-Id / X-Algolia-API-Key / X-Algolia-User-ID headers and JSON-error parsing. Tests cover region+env+override+fallback resolution, request/response round-trips, pagination params, context cancellation, and every documented HTTP error mapping. Verified live against a beta app on the deployed conversational-ai backend; no factory wiring yet. A TODO marks future replacement with a generated client once the backend publishes one from specs/agent-studio/spec.yml. Co-authored-by: Cursor <cursoragent@cursor.com>

Wires a new agentstudio client into the factory and exposes the first two read-only commands. `list` supports --page, --per-page, --provider-id, and --output (table on TTY, JSON for piping). `get <id>` defaults to JSON. Host resolution uses, in order: 1. ALGOLIA_AGENT_STUDIO_URL env var, or the profile's "agent_studio_url" field 2. Build-time agentstudio.DefaultBaseURL (ldflag-injected from .env's ALGOLIA_AGENT_STUDIO_URL — internal beta builds set this to https://agent-studio.staging.eu.algolia.com) 3. Cluster-proxy fallback https://<appID>.algolia.net/agent-studio (recommended for production end-users — the application's own cluster routes the request to the right region, per the conversational-ai README) Verified live against the beta backend (staging EU) using only `task build` defaults; fixes a regression where 422 validation errors were surfaced with the generic "Input is invalid, see detail/body:" prefix instead of the structured FastAPI detail.msg — extractDetail now prefers structured detail[] over the generic message field. Co-authored-by: Cursor <cursoragent@cursor.com>

Six new mutating commands under `algolia agents`, completing parity with the conversational-ai backend's `/v1/agents/*` surface. Body for `create`/`update` is passed as opaque `json.RawMessage`. The AgentConfigCreate schema is large, deeply validated, and evolves frequently — the CLI is a pass-through; the backend validates; our 422-detail surfacing (Phase 2 fix) makes errors actionable. Mirroring the schema in Go would lie about parity and force a CLI release on every backend change. `--dry-run` shows the resolved request body in human mode (more useful than the existing `objects/update` count-only summary), and emits a structured `{action, request, source, bytes, body, dryRun}` envelope only when `--output` is *explicitly* set — `WithDefaultOutput ("json")` from the success path no longer steals the human dry-run path silently. `delete` follows `objects/delete` exactly for the confirmation contract: `-y/--confirm` skips the prompt; non-TTY without `--confirm` refuses with `cmdutil.FlagErrorf`; `--dry-run` GETs the agent and prints "would DELETE" without calling DELETE. Pre-fetching lets us show name + status both at the prompt and in dry-run output. DELETE handles 204 No Content correctly (no body decode); the other five mutating endpoints share `doAgentMutation` (POST/PATCH + JSON Agent response). Live-verified end-to-end against staging EU: 13/13 checks pass — real create, get, update (rename), duplicate, delete; publish correctly surfaces backend `409 Cannot publish an agent without a model or provider` for a model-less smoke agent. Co-authored-by: Cursor <cursoragent@cursor.com>

codacy-production · 2026-05-06T18:45:23Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 1390 complexity · 859 duplication

Metric Results

Complexity 1390

Duplication 859

View in Codacy

_{TIP This summary will be updated as you push new changes.}

Two new commands close the developer loop on the agents tree: `agents test` for iterating on an in-memory agent configuration without persisting it (backend special-cases agent_id="test"), and `agents run <id>` for invoking a persisted agent — the same call shape downstream apps make in production. The wire format is not standard SSE. Recon caught this before code was written: v5 IS standard SSE (`data: <json>\n\n` with a `[DONE]` sentinel), but v4 is a Vercel-AI bespoke protocol — `<type>:<json>\n` per line, no terminator, with 16 single-character type codes mapped to names (`0`=text, `9`=tool-call, `d`=finish-message, …). The `compatibilityMode` query parameter is mandatory server-side with no default; the CLI defaults to v5 (standard, easier to parse defensively) and exposes `--compatibility v4|v5`. The new `api/agentstudio/sse.go` parser sniffs the line prefix and handles both formats transparently — same approach the backend's own acceptance test helpers use. `client.Completions(...)` returns the raw `*http.Response` so the caller decides streaming vs buffered based on `Content-Type`. One method instead of two avoids duplicating request building for one endpoint. The error path closes the body explicitly (the existing `checkResponse` only drains a 64 KiB prefix). Streaming output is NDJSON regardless of TTY — one parsed event per line as `{"type":"...","data":{...}}`. Predictable; pipes cleanly into `jq -r 'select(.type=="text-delta") | .data.delta'` for the "just show me the text" use case. Rich TTY rendering (incremental token print + dim tool-calls) is deferred — not worth growing this PR another ~150 LOC for a UX nicety. `--no-stream` returns the buffered JSON response verbatim. `--dry-run` prints the resolved body and skips the HTTP call. `--message` is a one-shot convenience that wraps as `[{"role":"user","content":"<text>"}]`; `--input` reads a messages JSON array from a file or `-` (stdin); cobra's `MarkFlagsMutuallyExclusive` enforces exactly one. `signal.NotifyContext(SIGINT)` is wired locally inside `test/run`, not in `root.go`. Streaming commands are the only ones that need mid-call cancellation; touching the global root context risked silently regressing every other command's signal behavior. Backend's discriminated `messages` union is pre-validated client- side with a cheap `[]any` shape check so the obvious "passed an object instead of an array" case names the file in the error instead of producing a 422. Live-verified end-to-end against staging EU: 11/11 checks pass. The test app has no LLM provider configured, so a real completion can't run — but the 422s we get back ("Field required" from the test endpoint, "Agent is not published" from `agents run` against the existing draft agent) confirm URL routing, query params, headers, body framing, and 422-detail extraction all land where intended. The streaming parser path is unit-tested against a synthetic SSE source (8 cases covering both wire formats, malformed lines, large frames, [DONE], onEvent error propagation). Other pieces: - `NormalizeCompatibility` lives in `pkg/cmd/agents/shared/` (extracted on second use, same rule as `printDryRun` in Phase 3). - `BuildMessages` / `ReadJSONFile` / `MarshalCompletionBody` / `RenderCompletion` shared between the two commands; all unit- tested in isolation. Co-authored-by: Cursor <cursoragent@cursor.com>

…idation order Three pieces that round out the agents tree for review. AGENTS.md grew an "Agent Studio" section so the next contributor doesn't replay the recon: architecture, auth model, base-URL resolution priority chain, SSE wire format gotchas (v4 IS NOT standard SSE; `compatibilityMode` is mandatory; CLI defaults to v5), dry-run convention (`cmd.Flags().Changed("output")` gating, not `PrintFlags.HasStructuredOutput()`), and a brief on why no bespoke telemetry events were added. Two new e2e testscripts: - testscripts/agents/dry-run.txtar (ungated): the contract test that the dry-run paths still produce stable output through the full cobra tree under the e2e binary. Exercises create/update structured + human modes, test with --input/--message, run with the agent id in the path, and the flag-validation guards. Covers the create/update/test/run dry-run paths in one file (delete dry-run is exercised in unit tests; it doesn't fit the same shape because it pre-fetches the agent and so still needs network). - testscripts/agents/list.txtar (env-gated on ALGOLIA_AGENT_STUDIO_E2E=1): read-only smoke confirming the live backend still emits the wire format we parse against. Gated because the standard e2e test app doesn't necessarily have the gen_ai feature flag enabled — running this against the wrong app would 403 and fail the suite. To make the env-gating possible, registered an `envCondition` in e2e_test.go that exposes `[env:VAR]` and `[!env:VAR]` to all testscripts. The cli/go-internal testscript framework only ships GOOS/GOARCH conditions out of the box; this is a one-liner that treats "" and "0" as falsy (matches how callers actually set ALGOLIA_AGENT_STUDIO_E2E=1). Behavior fix discovered while writing dry-run.txtar: `--compatibility` was validated AFTER the dry-run short-circuit in `agents test` and `agents run`, so `--compatibility v9 --dry-run` silently produced a "valid-looking" preview for an invalid request. Validation now runs first; same rationale as rejecting --input + --message together at flag-parse time. The unit test that asserted "invalid value rejected" had been running in non-dry-run mode and thus already covered the new behavior; no test churn needed. Telemetry deliberately NOT added. The existing `Command Invoked` event from `root.PersistentPreRunE` carries `{command: cmd.CommandPath(), flags: [<changed flag names>]}` which already attributes per verb (`algolia agents create`) and per dry-run (flags contains "dry-run"). Adding bespoke `agents.<verb>.{start,success,error,dry_run}` events would be the first in-command telemetry calls in the entire codebase, diverging from convention for one feature only. Outcome (success/error) is a separate, all-commands refactor. Docs generator (`go run ./cmd/docs --app_data-path <dir>`) verified to serialize all 10 agents commands cleanly with the expected frontmatter / examples / usage strings. Output is consumed downstream by the docs site and is intentionally not committed. Live-CRUD testscript (create→get→update→delete chained on the same id) deferred — it needs an id-extraction helper in the testscript framework that we don't have yet. The shell smokes in tmp/qa/ already exercise that path against staging. Co-authored-by: Cursor <cursoragent@cursor.com>

User caught a real terminological collision in code review. The brief said "tools to test and debug their Agent Studio setups with --dry-run". I had read that as a literal flag spec on every relevant command. Cross-checking the backend disproved that read: - Strings `dry_run`/`dryRun`/`dry-run` in `algolia/conversational-ai@development`: zero hits. - The closest backend primitive is `AgentTestConfiguration` + the `agent_id="test"` route — a real completion against an in-memory configuration, nothing persisted. That route is what `agents test` (now renamed `agents try`) wraps. The whole command IS the conversational-ai-side "dry-run" — running unsaved config against the real backend. So `agents test --dry-run` read literally as "dry-run a dry-run", which is the awkward case the user pointed at. Resolution: 1. Rename `agents test` → `agents try`. "Try" is single-word (preserves the CLI-wide convention — zero hyphenated subcommands exist anywhere in the tree), reads as "experiment without commitment", and avoids the pytest/unit-test connotations of "test" that don't apply here. `try` is registered in `agents.go` via the `trycmd` import alias to avoid clashing with Go's `try` keyword candidate / common variable name. 2. Drop the `--dry-run` flag (and its branch in `runTryCmd`) from the renamed command. There's nothing useful to "preview without HTTP" — the command is already side-effect-free. Dropped the corresponding unit test and replaced it with a comment block explaining why the test no longer exists. 3. Keep `--dry-run` on `create / update / delete / run`. That flag is the standard CLI preview convention (matches `objects/update --dry-run`); it lives on commands that DO have side effects (`create`/`update`/`delete` mutate state, `run` invokes a persisted agent which logs conversations + analytics + tokens). No collision with the conversational-ai concept on any of those. 4. Updated `e2e/testscripts/agents/dry-run.txtar`: removed the `agents test --dry-run` cases, added a regression assertion that `agents try --dry-run` is rejected as `unknown flag` so a future contributor doesn't re-add it without seeing this rationale. New `try.txtar` covers the four flag-validation guards on `agents try` (missing --config, missing --input/-m, --input/-m together, invalid --compatibility) — none of which need network, all fire before client construction. 5. AGENTS.md grew an "On `--dry-run`" section spelling out the two distinct concepts and why they must stay separate. Future contributors see this before touching the agents tree. What I got wrong originally: I read the brief's "--dry-run" literally instead of as shorthand for "the test/debug capability the spec describes". Once `agents try` exists, the brief's intent is satisfied by the command itself; layering a CLI-convention `--dry-run` on top of it was duplicating the deliverable. The unit tests I'd written for `agents test --dry-run` had been passing because the dry-run branch was correctly implemented — not because it was the right thing to implement. Pattern noted: when a feature already satisfies an intent, double-check before adding a flag that "completes" it. Net diff: +28 / -378. The body-preview semantics are still available everywhere they make sense; only the redundant collision case is gone. Co-authored-by: Cursor <cursoragent@cursor.com>

Closes the four completion query-param gaps and the cache-invalidate endpoint flagged by the parity audit. Smallest delta of the planned phases — chosen to land first because it validates the per-phase-PR rhythm without introducing brand-new architectural surface. What's in: - `agents try` and `agents run` learn `--no-cache`, `--no-memory`, `--no-analytics`, and `--secure-user-token <jwt>`. Maps onto the backend's documented completion query params and the X-Algolia-Secure-User-Token header. - New `agents cache invalidate <agent-id> [--before YYYY-MM-DD]` wraps `DELETE /1/agents/{id}/cache`. Mirrors `agents delete`'s confirmation contract: TTY prompts, non-TTY refuses without `--confirm`, `--dry-run` bypasses both since it's non-destructive. Polarity calls (worth knowing for future contributors): - The CompletionOptions.No* fields are inverted from the wire on purpose. Backend defaults all three to true; only the negative is interesting to expose at the CLI surface, and `memory=true` would actually 422 (the schema is `anyOf [{const false}, {type null}]`). So the wire form omits the param when the No* field is false and sends `<param>=false` when true. Polarity is pinned end-to-end by a table-driven wire-mapping test in completions_test.go and smaller cmd-level guards in try/run. - Date validation on `--before` is deliberately the backend's job. Mirroring Pydantic's date parser in Go would create silent skew on minor backend bumps; the 422-detail surfacing already turns bad input into an actionable message verbatim. Architecture: introduces the first nested cobra group under `agents` (`agents cache <verb>`). Phase 6+ reuse this for providers / conversations / keys / domains. Per-verb sub-package wasn't worth it for one verb; if `cache stats` etc. land later, split then. Refactor on second use deferred deliberately: 4 new flags duplicate mechanically across try.go and run.go (8 lines per command). Could be extracted into a `RegisterCompletionFlags` helper, but try and run are exactly two consumers and the duplication is straight-line. If a third consumer appears, extract following the rule we've used since Phase 3 (`PrintDryRun`, `NormalizeCompatibility`). Coverage: - `api/agentstudio/`: 1 new client method (`InvalidateAgentCache`), extended `CompletionOptions`, table-driven wire-mapping test covering all four flag combinations + the secure-user-token header, four new cache-endpoint tests (no-before / with-before / 404→ErrNotFound / structured-422-detail surfaces verbatim). - `pkg/cmd/agents/cache/`: new package, 6 unit tests covering happy path, before-flag, dry-run with and without `--before`, missing agent-id, non-TTY-without-confirm, and 404 propagation. - `pkg/cmd/agents/try/` and `run/`: one targeted end-to-end test each, asserting cobra→opts→client wiring is intact for all four new flags (catches polarity transposition, missed forwarding). - e2e: new `cache.txtar` (5 contract checks); `dry-run.txtar` extended with a regression that the new `--no-*` flags are wire-only and don't leak into the dry-run body preview. Live verification deferred. Staging test app still has no LLM provider configured, so a green completion isn't possible — but cache invalidation doesn't depend on a provider, and the unit + e2e coverage is exhaustive for the wire shapes. A future Phase 6 commit that lands provider CRUD will revive the live-completion smoke. Strategy: per-phase PR from here on. This branch (`lab_week_3_phase5`) is cut from `lab_week_3` HEAD; the PR will target `cmarguta-alg:lab_week_3` so the diff is bounded to Phase 5 only. When PR algolia#212 merges upstream, this PR's base re-targets to `algolia:main`. Net diff: +390 / −21 across 10 files, 2 new files (cache package, cache.txtar). 13/13 agents packages green, all 4 e2e testscripts pass (try / cache / dry-run / list-skipped-as-gated), gofumpt clean, golangci-lint clean. Co-authored-by: Cursor <cursoragent@cursor.com>

Adds two new nested command groups under `algolia agents`, lifting the CLI from 9/41 to 19/41 documented Agent Studio endpoints (~46%): agents providers <list|get|create|update|delete|models> agents config <get|set> API client (api/agentstudio/): - providers.go: ListProviders/GetProvider/CreateProvider/UpdateProvider/ DeleteProvider + ListProviderModels (catalog) + ListModelsForProvider (configured-provider). Provider.Input is json.RawMessage to mirror the 6-way discriminated input union (openai/azure_openai/google_genai/ deepseek/openai_compatible/anthropic) — same rationale as Agent.Config. - configuration.go: GetConfiguration/UpdateConfiguration. Body kept as json.RawMessage for forward-compat even though today's schema is a single int (maxRetentionDays). CLI (pkg/cmd/agents/{providers,config}/): - providers.go: 6 verbs in one file extending the cache-group precedent (rather than per-verb subpackages). `models` is one command with two routes — no flag hits the static catalog, --provider-id hits the configured-provider listing. - mask.go: redacts `apiKey` to "***" by default in providers list/get/ create/update success output. --show-secret opts out for scripted exports. Sets the convention for Phase 8 (`agents keys`). - config.go: `set` accepts either `--retention-days N` or `-F file.json` (mutually exclusive). File path is forward-compat for future fields. Docs + tests: - AGENTS.md: updated sub-group listing, provider client conventions, and a new "Provider secret masking" section explaining the redaction contract. - 27 client tests + 16 cmd tests + 2 e2e testscripts (providers.txtar, config.txtar) covering flag validation, dry-run output shape, masking on/off, and route picking. - Live-verified against staging EU: config get returns the live retention default, providers list returns proper paginated shape on empty, models catalog returns ~50 entries across 4+ provider types. Co-authored-by: Cursor <cursoragent@cursor.com>

Refactor only — no behaviour changes. Both the API client layer and the providers command package now mirror the OpenAPI spec's tag structure (one source file per tag), which makes it easier to add a new endpoint or verb later: a single new file plus its sibling test file. api/agentstudio/ client.go 568 LOC monolith → 193 LOC infra-only (Config/Client/NewClient/setHeaders/checkResponse/ extractDetail/sentinelFor) agents.go NEW — Agents tag (List/Get/Create/Update/Delete/ Publish/Unpublish/Duplicate/InvalidateAgentCache + doAgentMutation/doAgentLifecycle helpers) completions.go NEW — Completions tag (Completions + boolToWire) agents_test.go (renamed from mutations_test.go) absorbs GetAgent_* from get_agent_test.go (deleted) and the ListAgents happy-path tests previously in client_test.go. client_test.go now holds only the cross-cutting infra tests (NewClient validation, checkResponse error mapping, ctx cancellation, header omission). The error-mapping suite still uses ListAgents as a vehicle since it's the simplest GET shape; the file header explains why. pkg/cmd/agents/providers/ providers.go 738 LOC → 93 LOC (parent + readBody/ctxOrBackground/ relTimeOrDash helpers) list.go, get.go, create.go, update.go, delete.go, models.go NEW — one file per verb, same package so MaskInput and the helpers stay un-exported providers_test.go shrinks to the test harness (newClientForServer + writeTempJSON); each verb gets a sibling _test.go with its own scenarios. AGENTS.md gains a "File organisation" section documenting both layers' tag-aligned conventions and the explicit decision NOT to promote sub-group verbs to their own subpackages. All gates green: gofumpt, go vet, go test ./..., e2e TestAgents/{cache, try,dry-run,config,providers}, task build, golangci-lint on touched dirs. Co-authored-by: Cursor <cursoragent@cursor.com>

Adds 5 endpoints under `agents conversations <verb>`: list, get, delete, purge, export. Brings CLI parity to 24/41 documented Agent Studio endpoints (~59%, ~71% of publicly visible). Client (api/agentstudio/conversations.go): - ListConversations → typed PaginatedConversationsResponse - GetConversation → json.RawMessage (full shape has discriminated message-role union; passthrough is honest) - DeleteConversation → 204 sentinel-mapped - PurgeConversations → 204 (DELETE on the list endpoint) - ExportConversations → json.RawMessage (spec leaves shape unspecified) ListConversationsParams.FeedbackVote is *int because nil = no filter while 0 (downvote) is a meaningful value. Regression-tested. Commands (pkg/cmd/agents/conversations/, per-verb files following the providers/ pattern from the previous refactor): - list: table view + --output json passthrough; supports --start-date/--end-date/--include-feedback/--feedback-vote. --feedback-vote requires --include-feedback (backend constraint, rejected at CLI for a clearer error). - get: raw JSON passthrough (pretty-printed by default, structured via --output json), --include-feedback toggle. - delete: single-record; mirrors `agents delete` confirm contract (--confirm required non-TTY, --dry-run previews). - purge: bulk; two extra guardrails on top of the confirm contract: (a) dateless purge requires explicit --all (otherwise a typo on --start-date would silently turn a range purge into a wipe-all); (b) --all is mutually exclusive with --start-date/--end-date. - export: --output-file writes raw bytes (jq-friendly), stdout pretty-prints; --start-date/--end-date scope. All verbs take agent-id as the first positional argument, matching `agents publish/run/cache invalidate`. Tests: - 14 client unit tests (incl. *int FeedbackVote zero-value regression, feature-disabled propagation, ID validation, dateless-purge wire shape). - 16 cmd-level tests (incl. dateless-purge guardrail rejection, --all + date mutex rejection, feedback-vote-without-include rejection, dry-run URL+scope shapes). - e2e/testscripts/agents/conversations.txtar (no registration needed — TestAgents glob-runs everything in testscripts/agents/). AGENTS.md gains a "Conversations: purge vs delete" section documenting both guardrails and why they live at the cmd layer (the wire client mirrors the spec). Gates green: gofumpt, go vet, go test ./... agent dirs (16/16 packages), e2e TestAgents/{cache,try,dry-run,config,providers,conversations}, golangci-lint, task build. Co-authored-by: Cursor <cursoragent@cursor.com>

Spec-vs-reality mismatch caught by Anya's Phase 7 live vet against staging EU. OpenAPI declares both `startDate` and `endDate` on `DELETE /1/agents/{id}/conversations` as `required: false`, which reads as "dateless purge wipes everything." The live backend disagrees: dateless DELETE returns `400 "At least one filter is required"`. The previously-shipped `--all` flag was therefore non-functional — every invocation produced a request the backend always rejected. Changes: - Remove `--all` (and the dead `PurgeOptions.All` field). - Require at least one of `--start-date`/`--end-date` at the CLI; error message names the backend constraint so future spec drift stays debuggable. - Document the open-ended-bound workaround (--start-date 1970-01-01) in help text for genuine wipe-all needs. Verified live (exit 0). - Tests realigned: drop the obsolete --all/dateless-without-all cases, add Test_runPurgeCmd_RefusesDateless and Test_runPurgeCmd_AcceptsOpenEndedStart. - e2e/testscripts/agents/conversations.txtar updated to match. - AGENTS.md gains an explicit "Spec vs reality on purge" callout with the failing wire trace from the vet. Lesson noted in tmp/lab_week_3_plan.md and tmp/qa/anya_phase7_vet/ REPORT.md: trust live curl over the FastAPI OpenAPI spec, which reflects the function signature rather than the validation logic. Gates green post-fix: gofumpt, go vet, conversations unit tests, TestAgents/conversations e2e, golangci-lint on agents tree, task build. Co-authored-by: Cursor <cursoragent@cursor.com>

Adds 6 endpoints under `agents domains <verb>`: list, get, create, delete, bulk-insert, bulk-delete. Per-agent CORS allowlist; agent ID positional first arg as in conversations/cache. Client (api/agentstudio/domains.go): - Typed AllowedDomain + AllowedDomainListResponse (no opaque JSON, no pagination — list returns {domains:[]} flat). - Bulk insert wants {domains:[]string}, bulk delete wants {domainIds:[]string} as a request body (DELETE with body). Commands (pkg/cmd/agents/domains/): - list, get, create (--domain), delete (-y/--dry-run), bulk-insert (--domain repeated or -F file), bulk-delete (--domain-id repeated or -F file). Bulk verbs accept a JSON array of strings via -F for scripted use; --domain* and -F are mutually exclusive. - Live-probed against staging EU before coding; wire shapes match the spec exactly. Gates: gofumpt, vet, unit, e2e (TestAgents/domains), lint, build. Co-authored-by: Cursor <cursoragent@cursor.com>

… 8b) 5 endpoints under `agents keys <verb>`: list, get, create, update, delete. Per-app scope, optional per-agent allowlist via agentIds. Mutating verbs require an admin API key. Live-probed against staging EU: GET /1/secret-keys returns 200 with empty list, all mutating verbs return 403 {"message":"Admin API key required."} with a non-admin key (extractDetail already maps {message} envelopes — gives ErrForbidden + clear detail). Documented in command long-help. MaskInput moved from pkg/cmd/agents/providers/ to pkg/cmd/agents/shared/ since two consumers now redact secrets (provider input.apiKey, secret-key Value). Adds MaskString for top-level scalar secrets. Update semantics: - --name and --agent-id are independently set; nil pointer = leave field unchanged in the PATCH body (matches OpenAPI's anyOf-null). - Repeated --agent-id replaces (not merges) the allowlist. - --agent-id="" clears the allowlist (empty strings filtered out; if all values are empty the resulting [] still gets sent through). Output safety: - SecretKey.Value is redacted as "***" by default in list/get/ create/update output. - --show-secret opts in to raw value (e.g. for piping into a vault). Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/keys), lint, build. Co-authored-by: Cursor <cursoragent@cursor.com>

Two single-tag sub-groups: agents feedback create POST /1/feedback agents user-data get GET /1/user-data/{user_token} agents user-data delete DELETE /1/user-data/{user_token} Live-probed against staging EU. Wire shapes match the spec; the feedback endpoint returns 404 with {message:...} envelope for unknown message IDs (already mapped via extractDetail). Notes: - feedback validates locally: vote ∈ {0,1}, tags ≤10 each ≤50 chars, notes ≤1000 chars (matches OpenAPI bounds; saves a round-trip on common mistakes). - user-data is the GDPR right-to-access (get) and right-to-be- forgotten (delete) endpoints; delete is irreversible and erases across every agent in the app, so the prompt is explicit and --confirm is enforced in non-interactive shells. - GetUserData returns the full envelope with []json.RawMessage inner items (conversations have a discriminated message-role union, memories have an unspecified shape — same convention as GetConversation in Phase 7). Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/ feedback + TestAgents/userdata), lint, build. Co-authored-by: Cursor <cursoragent@cursor.com>

Closes 100% of the Agent Studio OpenAPI surface (41/41 endpoints). New top-level group (parent hidden, children visible): agents internal status GET /status agents internal memorize POST /1/agents/agents/{id}/memorize agents internal ponder POST /1/agents/agents/{id}/ponder agents internal consolidate POST /1/agents/agents/{id}/consolidate New providers verb: agents providers defaults GET /1/providers/models/defaults Path quirk verified live: memorize/ponder/consolidate live under the doubled prefix /1/agents/agents/ (the spec is correct, not a typo). The single-prefix equivalents 404 with FastAPI's standard {detail: "Not Found"} envelope; the doubled prefix routes to the actual handler. Pinned in the godoc on api/agentstudio/internal.go. Hidden semantics: parent has Hidden:true so it doesn't pollute `agents --help`, but children are visible — running `agents internal --help` still surfaces what's available to anyone who knows to look. Pinned by a unit test. Memory verbs (memorize/ponder/consolidate) take the agent ID positional and a JSON body via --body or -F ("-" for stdin), mirroring the agents try input contract. Body shapes are evolving (messages is a v4/v5 discriminated role union per the spec) so the client passes the body through as-is and returns the response as json.RawMessage — no premature typing. defaults: small read returning {provider_type → recommended_model}. Sibling of `agents providers models` — different question, different shape, didn't fit as a flag on the existing models verb. Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/*), lint, build. Co-authored-by: Cursor <cursoragent@cursor.com>

…ya F-1) Anya's Phase 8 vet caught an asymmetric 404 from the staging gateway: `agents user-data get` and `delete` against any token containing "/" return `agent studio: 404 Not Found: Not Found` while every other unknown token (including ones with +, ?, #, &, =) returns a clean 200 empty / 204. Root cause is server-side — the gateway decodes %2F before routing so the path no longer matches /1/user-data/{token}. The CLI does the right thing with url.PathEscape; the symptom is a misleading 404 that an operator processing a GDPR deletion request would reasonably read as "data was never stored." Fix: refuse "/"-bearing tokens at the CLI layer with a precise error message that names the gateway behavior, on both verbs. One shared const, two two-line guards. No behavior change for tokens without "/". Live-verified: `./algolia agents user-data get "ab/cd"` now prints the explicit message instead of "404 Not Found." Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/userdata), lint, build. Co-authored-by: Cursor <cursoragent@cursor.com>

Three small follow-ups to PR algolia#212. * `agents try` / `agents run`: TTY-attached stdout now renders a human-readable transcript (text-deltas inline, tool calls/results as dim annotations, errors red) instead of NDJSON. Non-TTY paths keep emitting NDJSON unchanged so existing pipelines (jq, log capture, scripts) are not broken. New `--ndjson` flag forces NDJSON on a TTY for users who want machine output on screen. * `agents domains bulk-delete`: now pre-fetches the existing list (TTY-only, one extra GET) so the success line can split requested IDs into removed-vs-already-absent. Non-TTY behavior unchanged. Backend returns 204 with no body, so this is the only signal we can give users about whether their delete actually did anything. * `agents keys list`: lock the table-render mask path with two new tests (default masks, --show-secret reveals). Existing tests only covered --output json — the table writes k.Value verbatim, so the mask must apply before the row is added. Co-authored-by: Cursor <cursoragent@cursor.com>

…ents Pulls the long-form Agent Studio rationale (file layout, auth, host resolution, streaming protocols, dry-run semantics, secret masking, backend gotchas) out of multi-paragraph in-source comments and out of AGENTS.md, into a single reference at docs/agents.md. Source files now keep one-line godoc comments only; gotchas that genuinely have to live next to the code (purge requires a date, slash in user-tokens, doubled memory-op path, memory polarity, ACL on configuration) are kept as 1-2 line hints with a pointer to docs/agents.md. AGENTS.md drops the agent-studio sections and links to the doc. Net: -945 / +219 lines across 25 files, no behavior change. Docs generator + full test sweep + gofumpt all clean. Co-authored-by: Cursor <cursoragent@cursor.com>

Five low-risk DRY passes across `pkg/cmd/agents/`: - `newClientForServer` × 14 test files → `sharedtest.NewClient` (new `pkg/cmd/agents/sharedtest/` package, kept out of production imports). `writeTempJSON` × 6 → `sharedtest.WriteTempJSON`. - `ctxOrBackground` × 7 sub-group files → `shared.OrBackground` and inline `if ctx == nil { ctx = context.Background() }` blocks in delete/cache/try/run. - `--confirm` / "non-TTY refuses without -y" pattern × 9 commands → `shared.AddConfirmFlag` + `shared.ResolveConfirm` + `shared.Confirm`. - File-reading helpers (`readBody`, `readJSONBody`, inline `cmdutil.ReadFile + TrimUTF8BOM + json.Valid` in config/update/create/domains) → `shared.ReadJSONFile`. - Completion-flag block (8 flags + the input/message mutex) duplicated in `try` and `run` → `shared.CompletionInputs` + `shared.RegisterCompletionFlags`. Net: 102 files changed, +587 / -770. Behavior unchanged; full suite green, golangci-lint clean. Co-authored-by: Cursor <cursoragent@cursor.com>

Allow common provider flows without -F: --name/--provider plus --api-key, --api-key-stdin, or --api-key-env for create (openai, anthropic, google_genai, deepseek); patches via --name, API key flags, and optional --base-url for update. Azure and openai_compatible still require JSON files. Keeps file and shortcut paths mutually exclusive. Updates agents e2e providers.txtar for the relaxed -F requirement. Co-authored-by: Cursor <cursoragent@cursor.com>

Adds docs/agents.md section on providers -F vs flags; links to docs/qa/arg_friendly_providers_SIGNOFF.md (Lab Week planning-team alignment + ordered test commands). Co-authored-by: Cursor <cursoragent@cursor.com>

Co-authored-by: Cursor <cursoragent@cursor.com>

tmp is cleaned by Taskfile and may contain local QA Go packages outside review; use issues.exclude-dirs instead of deprecated run.skip-dirs. Co-authored-by: Cursor <cursoragent@cursor.com>

Reuse build ldflags (including Agent Studio URL from .env); output binary algolia-beta and default VERSION main+beta. Document in AGENTS and README. Co-authored-by: Cursor <cursoragent@cursor.com>

Inject version.Distribution via task build-beta ldflags (empty for releases). PersistentPreRunE + help wrapper on agents group warns on stderr; defaults stay on stdout pipelines. Docs + DIST template in Taskfile. Co-authored-by: Cursor <cursoragent@cursor.com>

Remove docs/qa sign-off sheet; trim agents.md link. Strip tmp/qa, named QA, and Phase N callouts from tests and e2e scripts while keeping technical rationale. Co-authored-by: Cursor <cursoragent@cursor.com>

Strip the experimental --dry-run flag and preview branches from every agents subcommand. ResolveConfirm now only reflects TTY versus --confirm. Remove shared PrintDryRun and fold body helpers down to SourceLabel and TrimUTF8BOM. Refresh docs/agents.md, AGENTS.md, and README wording for beta builds. Rewrite e2e testscripts/agents cases and drop agents dry-run smoke. Align try help/comments and unit tests. Co-authored-by: Cursor <cursoragent@cursor.com>

NatanTechofNY · 2026-05-08T16:58:35Z

🤖 AI-Generated Code Review by Claude

PR #212 Code Review

Overview

Large new feature PR (+13,183 lines) adding the complete algolia agents command tree backed by a new api/agentstudio typed client.

Overall: Well-structured addition with good test coverage, sensible guardrails (secret masking, bounded purge, path-escaping), and clear documentation. A few security and reliability concerns in the transport layer and URL-construction code warrant attention before shipping.

🚨 HIGH ISSUES

1. `--api-key` Flag Persists LLM Provider Key to Shell History

File: pkg/cmd/agents/providers/create.go (+ update.go)
Issue: The --api-key <key> flag accepts a raw plaintext secret on the command line. Any value passed this way lands in shell history (~/.zsh_history, ~/.bash_history), is visible in /proc/<pid>/cmdline on Linux for the lifetime of the process, and appears in ps aux output.
Risk: Read access to the user's home directory or process table is sufficient to harvest the key. Safe alternatives (--api-key-stdin, --api-key-env) already exist in this PR.
Required Actions:
- Consider dropping --api-key entirely and only keeping --api-key-stdin / --api-key-env (same decision made by many CLIs that handle credentials — e.g. gh, docker login).
- At minimum, emit a deprecation/security warning to stderr whenever --api-key is used directly.

2. URL Override Accepts `http://` — API Keys Sent Unencrypted, SSRF Risk

File: api/agentstudio/host.go — ResolveHost
Issue: The Override path returns the value verbatim after TrimRight. No scheme validation is performed. A misconfigured profile or injected env var can point ALGOLIA_AGENT_STUDIO_URL at http://attacker.example.com, causing every request (including those carrying X-Algolia-API-Key) to be sent unencrypted. Locally, http://169.254.169.254/ and similar IMDS endpoints are also reachable.
Risk: Credential exfiltration over plaintext HTTP; internal network probing (SSRF) on cloud VMs.
Required Actions:
- Reject overrides that don't use https://. Return an error: if !strings.HasPrefix(host, "https://") { return "", fmt.Errorf("agent_studio_url must use https, got %q", host) }.
- If http:// needs to stay for local dev, gate it behind an explicit --insecure flag with a loud stderr warning.

3. ApplicationID Embedded Unsanitised in Cluster-Proxy URL

File: api/agentstudio/host.go — cluster-proxy fallback
Issue: The fallback path is "https://" + appID + ".algolia.net/agent-studio". appID is only whitespace-trimmed — no format validation. A value containing # (e.g. evil.com#) makes Go's URL parser treat .algolia.net/agent-studio as a fragment, connecting to evil.com instead. Similarly a value with / or ? redirects the request while leaving .algolia.net in the path/query.
Risk: If ApplicationID can be set from an untrusted config source or env var injection, all API calls (including credentials) are silently sent to the injected host.
Required Actions:
- Validate ApplicationID against the expected format before URL construction: regexp.MustCompile("^[A-Z0-9]{8,12}$").MatchString(appID).
- Or construct the URL with url.URL{} fields so special characters are rejected/encoded and the resulting host is always <appID>.algolia.net.

4. Streaming Completions Have No Idle Read Deadline

File: api/agentstudio/completions.go + api/agentstudio/sse.go
Issue: Completions returns the raw *http.Response for the caller to stream. If the caller provides context.Background() or a context without a deadline, bufio.Scanner.Scan() inside ParseStream will block forever when the server opens a connection, sends a few events, then stalls (network partition, upstream crash, proxy mis-configuration). http.DefaultClient has no fallback timeout either.
Risk: CLI hangs indefinitely — only killable with SIGKILL. Particularly bad in unattended CI/CD pipelines.
Required Actions:
- Document (in godoc + docs/agents.md) that callers must supply a context with a deadline when calling Completions.
- Consider wrapping resp.Body in a deadline-rearming reader, or use a custom http.Transport with ResponseHeaderTimeout + IdleConnTimeout set.

⚡ PERFORMANCE / RELIABILITY CONCERNS

5. `http.DefaultClient` Used as Fallback — No Timeout, Shared Global State

File: api/agentstudio/client.go — NewClient
Issue: if cfg.HTTPClient == nil { cfg.HTTPClient = http.DefaultClient }. http.DefaultClient.Timeout is zero (no timeout). It's also a package-level global — any other package that mutates it will silently affect agentstudio requests.
Risk: Non-streaming requests (list, get, create, etc.) can hang indefinitely on slow servers. Shared state creates fragile coupling.
Fix: Always construct a private *http.Client with an explicit Timeout (e.g. 30 s) for non-streaming calls. For streaming, use a custom http.Transport with dial/TLS/header timeouts only, leaving Client.Timeout at zero.

6. SSE Scanner Buffer Allows 5 MB Per Line; Response Body Ownership is Implicit

File: api/agentstudio/sse.go + api/agentstudio/completions.go
Issue: const maxLine = 1024 * 5120 (5 MB). A single large SSE frame causes the scanner's backing buffer to grow to 5 MB and stay there for the full stream duration (bufio.Scanner does not shrink). Combined with Completions returning a raw *http.Response, a caller that abandons streaming mid-way without calling Close() leaks the TCP connection until GC finalises the response.
Risk: Memory pressure on long streams with large tool-result blobs; connection pool exhaustion in loop scripts.
Fix: Lower maxLine to a realistic cap (e.g. 512 KB) if 5 MB payloads aren't expected. Add a clear godoc note on Completions that the caller must Close() the response body.

🛠 BEST-PRACTICES NOTES

7. Silent SSE Frame Skipping Produces No Observable Signal

File: api/agentstudio/sse.go — ParseStream
Issue: Unparseable or unknown-type frames are silently dropped (if !ok { continue }). The rationale ("backends evolve faster than parsers") is sound for unknown future event types, but makes it impossible to distinguish "stream contained 0 useful frames" from "every frame was malformed due to a backend regression."
Fix: Accept an optional onUnknown func(raw string) callback, or count skipped frames and surface the count via a verbose/debug flag. A --debug path writing skipped raw lines to stderr would catch silent protocol mismatches immediately.

8. Agent E2E Tests Gated Behind `ALGOLIA_AGENT_STUDIO_E2E` — Will Bit-Rot in CI

File: e2e/e2e_test.go + e2e/testscripts/agents/
Issue: The live-agent e2e tests require a dedicated ALGOLIA_AGENT_STUDIO_E2E=1 env var separate from the ALGOLIA_APPLICATION_ID gate used by every other e2e test in the repo. These tests are almost certainly not wired into any CI job and will accumulate drift against the real API.
Fix: Follow the existing e2e pattern — gate on ALGOLIA_APPLICATION_ID being set (or a single shared flag). Document in CONTRIBUTING how to run agent e2e tests and add a CI step or release checklist item.

9. User Tokens Containing `/` Should Be Hard-Rejected in `DeleteUserData`

File: api/agentstudio/userdata.go
Issue: The docs note that user tokens with / produce an advisory exit, but the guard is documentation-only — url.PathEscape encodes / as %2F, which some reverse proxies (nginx merge_slashes, AWS ALB) silently decode before routing, treating it as a path component. For a destructive endpoint, a token like ../agents/some-agent could resolve to an unintended resource under a permissive gateway.
Fix: Add a hard rejection in both GetUserData and DeleteUserData: if strings.Contains(userToken, "/") { return fmt.Errorf("user token must not contain '/'") }. Don't rely on documentation for a destructive path.

✅ STRENGTHS

Secret masking: Provider and key secrets masked by default with --show-secret opt-in — well-implemented and tested across both TTY and JSON output paths.
Path escaping: url.PathEscape consistently applied to all ID parameters with explicit tests for edge-case IDs.
Bounded purge: Date-range requirement on conversation purge caught at flag-parse time with a clear test.
SSE protocol normalisation: The v4/v5 compatibility shim is clean and well-tested with explicit regression cases for query-parameter polarity.
Error sentinel hierarchy: errors.Is(err, ErrNotFound) chains via APIError.Unwrap() work cleanly and are tested end-to-end.

📋 BEFORE MERGE

Must Have:

Validate URL override scheme (https:// only) — host.go
Validate/sanitise ApplicationID format before cluster-proxy URL construction — host.go

Should Have:
3. Remove or warn on --api-key plaintext flag — providers/create.go, providers/update.go
4. Add read deadline or document context requirement for streaming completions — completions.go
5. Hard-reject / in user tokens in DeleteUserData — userdata.go
6. Align agent e2e tests with existing CI gating pattern — e2e/

🎯 RECOMMENDATION

Conditional Approval — Strong foundation with excellent test coverage and good UX guardrails. The URL-construction issues in host.go (items 2 & 3) are the only ones that represent a genuine pre-ship security risk given that ALGOLIA_AGENT_STUDIO_URL / agent_studio_url are user-configurable. Everything else is hardening and polish.

Scores:

Code Quality: 8/10
Security: 6/10
Performance: 7/10
Multi-Tenancy: 8/10

🤖 Generated by Claude — Automated review focusing on security, performance, and multi-tenancy

cmarguta-alg and others added 3 commits May 6, 2026 15:23

cmarguta-alg and others added 4 commits May 6, 2026 22:02

cmarguta-alg mentioned this pull request May 6, 2026

Phase 5: agents completion runtime knobs + cache invalidate cmarguta-alg/cli#1

Merged

7 tasks

cmarguta-alg and others added 9 commits May 6, 2026 23:52

cmarguta-alg changed the title ~~feat: add agents commands for Algolia Agent Studio~~ feat(agents): add Algolia Agent Studio command tree (full API surface) May 7, 2026

cmarguta-alg and others added 6 commits May 7, 2026 13:03

docs: provider flag ergonomics + QA sign-off sheet for Anya

750f1d2

Adds docs/agents.md section on providers -F vs flags; links to docs/qa/arg_friendly_providers_SIGNOFF.md (Lab Week planning-team alignment + ordered test commands). Co-authored-by: Cursor <cursoragent@cursor.com>

docs(qa): keep Anya's vet report local under tmp/, not committed

ee6e906

Co-authored-by: Cursor <cursoragent@cursor.com>

cmarguta-alg mentioned this pull request May 7, 2026

feat(agents): Agent Studio command tree (+ provider CLI flags) #214

Closed

cmarguta-alg and others added 2 commits May 7, 2026 15:25

style: gofumpt providers flag helpers and tests

abc737d

Co-authored-by: Cursor <cursoragent@cursor.com>

chore(lint): exclude tmp from golangci paths

ea5d644

tmp is cleaned by Taskfile and may contain local QA Go packages outside review; use issues.exclude-dirs instead of deprecated run.skip-dirs. Co-authored-by: Cursor <cursoragent@cursor.com>

cmarguta-alg mentioned this pull request May 7, 2026

feat(agents): providers create/update CLI flags (stacked on lab_week_3) cmarguta-alg/cli#4

Merged

build: add task build-beta for algolia-beta side-by-side installs

aeb52d9

Reuse build ldflags (including Agent Studio URL from .env); output binary algolia-beta and default VERSION main+beta. Document in AGENTS and README. Co-authored-by: Cursor <cursoragent@cursor.com>

cmarguta-alg and others added 4 commits May 7, 2026 17:26

Merge branch 'main' into lab_week_3

4712408

chore(agents): drop internal QA doc and phase-scaffolding comments

a9724a9

Remove docs/qa sign-off sheet; trim agents.md link. Strip tmp/qa, named QA, and Phase N callouts from tests and e2e scripts while keeping technical rationale. Co-authored-by: Cursor <cursoragent@cursor.com>

cmarguta-alg marked this pull request as ready for review May 8, 2026 14:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agents): add Algolia Agent Studio command tree (full API surface)#212

feat(agents): add Algolia Agent Studio command tree (full API surface)#212
cmarguta-alg wants to merge 29 commits intoalgolia:mainfrom
cmarguta-alg:lab_week_3

cmarguta-alg commented May 6, 2026 •

edited

Loading

Uh oh!

codacy-production Bot commented May 6, 2026 •

edited

Loading

Uh oh!

NatanTechofNY commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cmarguta-alg commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤔 What

🤷 Why

🛠 How

🧪 Testing

⚠️ Dependencies / risk

Uh oh!

codacy-production Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Up to standards ✅

Uh oh!

NatanTechofNY commented May 8, 2026

PR #212 Code Review

Overview

🚨 HIGH ISSUES

1. --api-key Flag Persists LLM Provider Key to Shell History

2. URL Override Accepts http:// — API Keys Sent Unencrypted, SSRF Risk

3. ApplicationID Embedded Unsanitised in Cluster-Proxy URL

4. Streaming Completions Have No Idle Read Deadline

⚡ PERFORMANCE / RELIABILITY CONCERNS

5. http.DefaultClient Used as Fallback — No Timeout, Shared Global State

6. SSE Scanner Buffer Allows 5 MB Per Line; Response Body Ownership is Implicit

🛠 BEST-PRACTICES NOTES

7. Silent SSE Frame Skipping Produces No Observable Signal

8. Agent E2E Tests Gated Behind ALGOLIA_AGENT_STUDIO_E2E — Will Bit-Rot in CI

9. User Tokens Containing / Should Be Hard-Rejected in DeleteUserData

✅ STRENGTHS

📋 BEFORE MERGE

🎯 RECOMMENDATION

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cmarguta-alg commented May 6, 2026 •

edited

Loading

codacy-production Bot commented May 6, 2026 •

edited

Loading

1. `--api-key` Flag Persists LLM Provider Key to Shell History

2. URL Override Accepts `http://` — API Keys Sent Unencrypted, SSRF Risk

5. `http.DefaultClient` Used as Fallback — No Timeout, Shared Global State

8. Agent E2E Tests Gated Behind `ALGOLIA_AGENT_STUDIO_E2E` — Will Bit-Rot in CI

9. User Tokens Containing `/` Should Be Hard-Rejected in `DeleteUserData`