feat(agents): add Algolia Agent Studio command tree (full API surface)#212
feat(agents): add Algolia Agent Studio command tree (full API surface)#212cmarguta-alg wants to merge 29 commits intoalgolia:mainfrom
Conversation
Introduces a stand-alone, hand-written client in api/agentstudio/ that
later cmd/agent-studio/* commands will sit on top of.
- host.go: resolve base URL from --agent-studio-url override, region
(eu/us, staging only in eu), or {appID}.algolia.net/agent-studio
cluster-proxy fallback.
- types.go: Go structs for Agent and PaginatedAgentsResponse, matching
the camelCase wire format produced by the conversational-ai backend.
- errors.go: structured APIError + sentinels (ErrUnauthorized,
ErrForbidden, ErrFeatureDisabled, ErrNotFound, ErrServer) so callers
can branch without parsing strings.
- client.go: ListAgents end-to-end, with X-Algolia-Application-Id /
X-Algolia-API-Key / X-Algolia-User-ID headers and JSON-error parsing.
Tests cover region+env+override+fallback resolution, request/response
round-trips, pagination params, context cancellation, and every
documented HTTP error mapping. Verified live against a beta app on the
deployed conversational-ai backend; no factory wiring yet.
A TODO marks future replacement with a generated client once the
backend publishes one from specs/agent-studio/spec.yml.
Co-authored-by: Cursor <cursoragent@cursor.com>
Wires a new agentstudio client into the factory and exposes the first
two read-only commands. `list` supports --page, --per-page,
--provider-id, and --output (table on TTY, JSON for piping).
`get <id>` defaults to JSON.
Host resolution uses, in order:
1. ALGOLIA_AGENT_STUDIO_URL env var, or the profile's
"agent_studio_url" field
2. Build-time agentstudio.DefaultBaseURL (ldflag-injected from
.env's ALGOLIA_AGENT_STUDIO_URL — internal beta builds set
this to https://agent-studio.staging.eu.algolia.com)
3. Cluster-proxy fallback https://<appID>.algolia.net/agent-studio
(recommended for production end-users — the application's own
cluster routes the request to the right region, per the
conversational-ai README)
Verified live against the beta backend (staging EU) using only `task
build` defaults; fixes a regression where 422 validation errors were
surfaced with the generic "Input is invalid, see detail/body:" prefix
instead of the structured FastAPI detail.msg — extractDetail now
prefers structured detail[] over the generic message field.
Co-authored-by: Cursor <cursoragent@cursor.com>
Six new mutating commands under `algolia agents`, completing parity
with the conversational-ai backend's `/v1/agents/*` surface.
Body for `create`/`update` is passed as opaque `json.RawMessage`. The
AgentConfigCreate schema is large, deeply validated, and evolves
frequently — the CLI is a pass-through; the backend validates; our
422-detail surfacing (Phase 2 fix) makes errors actionable. Mirroring
the schema in Go would lie about parity and force a CLI release on
every backend change.
`--dry-run` shows the resolved request body in human mode (more
useful than the existing `objects/update` count-only summary), and
emits a structured `{action, request, source, bytes, body, dryRun}`
envelope only when `--output` is *explicitly* set — `WithDefaultOutput
("json")` from the success path no longer steals the human dry-run
path silently.
`delete` follows `objects/delete` exactly for the confirmation
contract: `-y/--confirm` skips the prompt; non-TTY without `--confirm`
refuses with `cmdutil.FlagErrorf`; `--dry-run` GETs the agent and
prints "would DELETE" without calling DELETE. Pre-fetching lets us
show name + status both at the prompt and in dry-run output.
DELETE handles 204 No Content correctly (no body decode); the other
five mutating endpoints share `doAgentMutation` (POST/PATCH + JSON
Agent response).
Live-verified end-to-end against staging EU: 13/13 checks pass —
real create, get, update (rename), duplicate, delete; publish
correctly surfaces backend `409 Cannot publish an agent without a
model or provider` for a model-less smoke agent.
Co-authored-by: Cursor <cursoragent@cursor.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 1390 |
| Duplication | 859 |
TIP This summary will be updated as you push new changes.
Two new commands close the developer loop on the agents tree:
`agents test` for iterating on an in-memory agent configuration
without persisting it (backend special-cases agent_id="test"), and
`agents run <id>` for invoking a persisted agent — the same call
shape downstream apps make in production.
The wire format is not standard SSE. Recon caught this before code
was written: v5 IS standard SSE (`data: <json>\n\n` with a `[DONE]`
sentinel), but v4 is a Vercel-AI bespoke protocol — `<type>:<json>\n`
per line, no terminator, with 16 single-character type codes mapped
to names (`0`=text, `9`=tool-call, `d`=finish-message, …). The
`compatibilityMode` query parameter is mandatory server-side with
no default; the CLI defaults to v5 (standard, easier to parse
defensively) and exposes `--compatibility v4|v5`. The new
`api/agentstudio/sse.go` parser sniffs the line prefix and handles
both formats transparently — same approach the backend's own
acceptance test helpers use.
`client.Completions(...)` returns the raw `*http.Response` so the
caller decides streaming vs buffered based on `Content-Type`. One
method instead of two avoids duplicating request building for one
endpoint. The error path closes the body explicitly (the existing
`checkResponse` only drains a 64 KiB prefix).
Streaming output is NDJSON regardless of TTY — one parsed event per
line as `{"type":"...","data":{...}}`. Predictable; pipes cleanly
into `jq -r 'select(.type=="text-delta") | .data.delta'` for the
"just show me the text" use case. Rich TTY rendering (incremental
token print + dim tool-calls) is deferred — not worth growing this
PR another ~150 LOC for a UX nicety.
`--no-stream` returns the buffered JSON response verbatim.
`--dry-run` prints the resolved body and skips the HTTP call.
`--message` is a one-shot convenience that wraps as
`[{"role":"user","content":"<text>"}]`; `--input` reads a messages
JSON array from a file or `-` (stdin); cobra's
`MarkFlagsMutuallyExclusive` enforces exactly one.
`signal.NotifyContext(SIGINT)` is wired locally inside `test/run`,
not in `root.go`. Streaming commands are the only ones that need
mid-call cancellation; touching the global root context risked
silently regressing every other command's signal behavior.
Backend's discriminated `messages` union is pre-validated client-
side with a cheap `[]any` shape check so the obvious "passed an
object instead of an array" case names the file in the error
instead of producing a 422.
Live-verified end-to-end against staging EU: 11/11 checks pass.
The test app has no LLM provider configured, so a real completion
can't run — but the 422s we get back ("Field required" from the
test endpoint, "Agent is not published" from `agents run` against
the existing draft agent) confirm URL routing, query params,
headers, body framing, and 422-detail extraction all land where
intended. The streaming parser path is unit-tested against a
synthetic SSE source (8 cases covering both wire formats, malformed
lines, large frames, [DONE], onEvent error propagation).
Other pieces:
- `NormalizeCompatibility` lives in `pkg/cmd/agents/shared/`
(extracted on second use, same rule as `printDryRun` in Phase 3).
- `BuildMessages` / `ReadJSONFile` / `MarshalCompletionBody` /
`RenderCompletion` shared between the two commands; all unit-
tested in isolation.
Co-authored-by: Cursor <cursoragent@cursor.com>
…idation order
Three pieces that round out the agents tree for review.
AGENTS.md grew an "Agent Studio" section so the next contributor
doesn't replay the recon: architecture, auth model, base-URL
resolution priority chain, SSE wire format gotchas (v4 IS NOT
standard SSE; `compatibilityMode` is mandatory; CLI defaults to
v5), dry-run convention (`cmd.Flags().Changed("output")` gating,
not `PrintFlags.HasStructuredOutput()`), and a brief on why no
bespoke telemetry events were added.
Two new e2e testscripts:
- testscripts/agents/dry-run.txtar (ungated): the contract test
that the dry-run paths still produce stable output through the
full cobra tree under the e2e binary. Exercises create/update
structured + human modes, test with --input/--message, run with
the agent id in the path, and the flag-validation guards.
Covers the create/update/test/run dry-run paths in one file
(delete dry-run is exercised in unit tests; it doesn't fit the
same shape because it pre-fetches the agent and so still needs
network).
- testscripts/agents/list.txtar (env-gated on
ALGOLIA_AGENT_STUDIO_E2E=1): read-only smoke confirming the
live backend still emits the wire format we parse against.
Gated because the standard e2e test app doesn't necessarily
have the gen_ai feature flag enabled — running this against
the wrong app would 403 and fail the suite.
To make the env-gating possible, registered an `envCondition`
in e2e_test.go that exposes `[env:VAR]` and `[!env:VAR]` to all
testscripts. The cli/go-internal testscript framework only ships
GOOS/GOARCH conditions out of the box; this is a one-liner that
treats "" and "0" as falsy (matches how callers actually set
ALGOLIA_AGENT_STUDIO_E2E=1).
Behavior fix discovered while writing dry-run.txtar:
`--compatibility` was validated AFTER the dry-run short-circuit
in `agents test` and `agents run`, so `--compatibility v9
--dry-run` silently produced a "valid-looking" preview for an
invalid request. Validation now runs first; same rationale as
rejecting --input + --message together at flag-parse time. The
unit test that asserted "invalid value rejected" had been
running in non-dry-run mode and thus already covered the new
behavior; no test churn needed.
Telemetry deliberately NOT added. The existing `Command Invoked`
event from `root.PersistentPreRunE` carries
`{command: cmd.CommandPath(), flags: [<changed flag names>]}`
which already attributes per verb (`algolia agents create`) and
per dry-run (flags contains "dry-run"). Adding bespoke
`agents.<verb>.{start,success,error,dry_run}` events would be
the first in-command telemetry calls in the entire codebase,
diverging from convention for one feature only. Outcome
(success/error) is a separate, all-commands refactor.
Docs generator (`go run ./cmd/docs --app_data-path <dir>`)
verified to serialize all 10 agents commands cleanly with the
expected frontmatter / examples / usage strings. Output is
consumed downstream by the docs site and is intentionally not
committed.
Live-CRUD testscript (create→get→update→delete chained on the
same id) deferred — it needs an id-extraction helper in the
testscript framework that we don't have yet. The shell smokes in
tmp/qa/ already exercise that path against staging.
Co-authored-by: Cursor <cursoragent@cursor.com>
User caught a real terminological collision in code review.
The brief said "tools to test and debug their Agent Studio setups
with --dry-run". I had read that as a literal flag spec on every
relevant command. Cross-checking the backend disproved that read:
- Strings `dry_run`/`dryRun`/`dry-run` in
`algolia/conversational-ai@development`: zero hits.
- The closest backend primitive is `AgentTestConfiguration` +
the `agent_id="test"` route — a real completion against an
in-memory configuration, nothing persisted.
That route is what `agents test` (now renamed `agents try`) wraps.
The whole command IS the conversational-ai-side "dry-run" — running
unsaved config against the real backend. So `agents test --dry-run`
read literally as "dry-run a dry-run", which is the awkward case
the user pointed at.
Resolution:
1. Rename `agents test` → `agents try`. "Try" is single-word
(preserves the CLI-wide convention — zero hyphenated subcommands
exist anywhere in the tree), reads as "experiment without
commitment", and avoids the pytest/unit-test connotations of
"test" that don't apply here. `try` is registered in
`agents.go` via the `trycmd` import alias to avoid clashing
with Go's `try` keyword candidate / common variable name.
2. Drop the `--dry-run` flag (and its branch in `runTryCmd`) from
the renamed command. There's nothing useful to "preview without
HTTP" — the command is already side-effect-free. Dropped the
corresponding unit test and replaced it with a comment block
explaining why the test no longer exists.
3. Keep `--dry-run` on `create / update / delete / run`. That
flag is the standard CLI preview convention (matches
`objects/update --dry-run`); it lives on commands that DO have
side effects (`create`/`update`/`delete` mutate state, `run`
invokes a persisted agent which logs conversations + analytics
+ tokens). No collision with the conversational-ai concept on
any of those.
4. Updated `e2e/testscripts/agents/dry-run.txtar`: removed the
`agents test --dry-run` cases, added a regression assertion
that `agents try --dry-run` is rejected as `unknown flag` so
a future contributor doesn't re-add it without seeing this
rationale. New `try.txtar` covers the four flag-validation
guards on `agents try` (missing --config, missing --input/-m,
--input/-m together, invalid --compatibility) — none of which
need network, all fire before client construction.
5. AGENTS.md grew an "On `--dry-run`" section spelling out the
two distinct concepts and why they must stay separate. Future
contributors see this before touching the agents tree.
What I got wrong originally: I read the brief's "--dry-run"
literally instead of as shorthand for "the test/debug capability
the spec describes". Once `agents try` exists, the brief's intent
is satisfied by the command itself; layering a CLI-convention
`--dry-run` on top of it was duplicating the deliverable. The
unit tests I'd written for `agents test --dry-run` had been
passing because the dry-run branch was correctly implemented —
not because it was the right thing to implement. Pattern noted:
when a feature already satisfies an intent, double-check before
adding a flag that "completes" it.
Net diff: +28 / -378. The body-preview semantics are still
available everywhere they make sense; only the redundant
collision case is gone.
Co-authored-by: Cursor <cursoragent@cursor.com>
Closes the four completion query-param gaps and the cache-invalidate
endpoint flagged by the parity audit. Smallest delta of the planned
phases — chosen to land first because it validates the per-phase-PR
rhythm without introducing brand-new architectural surface.
What's in:
- `agents try` and `agents run` learn `--no-cache`, `--no-memory`,
`--no-analytics`, and `--secure-user-token <jwt>`. Maps onto the
backend's documented completion query params and the
X-Algolia-Secure-User-Token header.
- New `agents cache invalidate <agent-id> [--before YYYY-MM-DD]`
wraps `DELETE /1/agents/{id}/cache`. Mirrors `agents delete`'s
confirmation contract: TTY prompts, non-TTY refuses without
`--confirm`, `--dry-run` bypasses both since it's non-destructive.
Polarity calls (worth knowing for future contributors):
- The CompletionOptions.No* fields are inverted from the wire on
purpose. Backend defaults all three to true; only the negative is
interesting to expose at the CLI surface, and `memory=true` would
actually 422 (the schema is `anyOf [{const false}, {type null}]`).
So the wire form omits the param when the No* field is false and
sends `<param>=false` when true. Polarity is pinned end-to-end by
a table-driven wire-mapping test in completions_test.go and
smaller cmd-level guards in try/run.
- Date validation on `--before` is deliberately the backend's job.
Mirroring Pydantic's date parser in Go would create silent skew
on minor backend bumps; the 422-detail surfacing already turns
bad input into an actionable message verbatim.
Architecture: introduces the first nested cobra group under `agents`
(`agents cache <verb>`). Phase 6+ reuse this for providers /
conversations / keys / domains. Per-verb sub-package wasn't worth it
for one verb; if `cache stats` etc. land later, split then.
Refactor on second use deferred deliberately: 4 new flags duplicate
mechanically across try.go and run.go (8 lines per command). Could
be extracted into a `RegisterCompletionFlags` helper, but try and run
are exactly two consumers and the duplication is straight-line. If a
third consumer appears, extract following the rule we've used since
Phase 3 (`PrintDryRun`, `NormalizeCompatibility`).
Coverage:
- `api/agentstudio/`: 1 new client method (`InvalidateAgentCache`),
extended `CompletionOptions`, table-driven wire-mapping test
covering all four flag combinations + the secure-user-token
header, four new cache-endpoint tests (no-before / with-before /
404→ErrNotFound / structured-422-detail surfaces verbatim).
- `pkg/cmd/agents/cache/`: new package, 6 unit tests covering happy
path, before-flag, dry-run with and without `--before`, missing
agent-id, non-TTY-without-confirm, and 404 propagation.
- `pkg/cmd/agents/try/` and `run/`: one targeted end-to-end test
each, asserting cobra→opts→client wiring is intact for all four
new flags (catches polarity transposition, missed forwarding).
- e2e: new `cache.txtar` (5 contract checks); `dry-run.txtar`
extended with a regression that the new `--no-*` flags are
wire-only and don't leak into the dry-run body preview.
Live verification deferred. Staging test app still has no LLM
provider configured, so a green completion isn't possible — but
cache invalidation doesn't depend on a provider, and the unit + e2e
coverage is exhaustive for the wire shapes. A future Phase 6 commit
that lands provider CRUD will revive the live-completion smoke.
Strategy: per-phase PR from here on. This branch
(`lab_week_3_phase5`) is cut from `lab_week_3` HEAD; the PR will
target `cmarguta-alg:lab_week_3` so the diff is bounded to Phase 5
only. When PR algolia#212 merges upstream, this PR's base re-targets to
`algolia:main`.
Net diff: +390 / −21 across 10 files, 2 new files (cache package,
cache.txtar). 13/13 agents packages green, all 4 e2e testscripts
pass (try / cache / dry-run / list-skipped-as-gated), gofumpt
clean, golangci-lint clean.
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds two new nested command groups under `algolia agents`, lifting the
CLI from 9/41 to 19/41 documented Agent Studio endpoints (~46%):
agents providers <list|get|create|update|delete|models>
agents config <get|set>
API client (api/agentstudio/):
- providers.go: ListProviders/GetProvider/CreateProvider/UpdateProvider/
DeleteProvider + ListProviderModels (catalog) + ListModelsForProvider
(configured-provider). Provider.Input is json.RawMessage to mirror
the 6-way discriminated input union (openai/azure_openai/google_genai/
deepseek/openai_compatible/anthropic) — same rationale as Agent.Config.
- configuration.go: GetConfiguration/UpdateConfiguration. Body kept as
json.RawMessage for forward-compat even though today's schema is a
single int (maxRetentionDays).
CLI (pkg/cmd/agents/{providers,config}/):
- providers.go: 6 verbs in one file extending the cache-group precedent
(rather than per-verb subpackages). `models` is one command with two
routes — no flag hits the static catalog, --provider-id hits the
configured-provider listing.
- mask.go: redacts `apiKey` to "***" by default in providers list/get/
create/update success output. --show-secret opts out for scripted
exports. Sets the convention for Phase 8 (`agents keys`).
- config.go: `set` accepts either `--retention-days N` or `-F file.json`
(mutually exclusive). File path is forward-compat for future fields.
Docs + tests:
- AGENTS.md: updated sub-group listing, provider client conventions,
and a new "Provider secret masking" section explaining the redaction
contract.
- 27 client tests + 16 cmd tests + 2 e2e testscripts (providers.txtar,
config.txtar) covering flag validation, dry-run output shape, masking
on/off, and route picking.
- Live-verified against staging EU: config get returns the live
retention default, providers list returns proper paginated shape on
empty, models catalog returns ~50 entries across 4+ provider types.
Co-authored-by: Cursor <cursoragent@cursor.com>
Refactor only — no behaviour changes. Both the API client layer and the
providers command package now mirror the OpenAPI spec's tag structure
(one source file per tag), which makes it easier to add a new endpoint
or verb later: a single new file plus its sibling test file.
api/agentstudio/
client.go 568 LOC monolith → 193 LOC infra-only
(Config/Client/NewClient/setHeaders/checkResponse/
extractDetail/sentinelFor)
agents.go NEW — Agents tag (List/Get/Create/Update/Delete/
Publish/Unpublish/Duplicate/InvalidateAgentCache +
doAgentMutation/doAgentLifecycle helpers)
completions.go NEW — Completions tag (Completions + boolToWire)
agents_test.go (renamed from mutations_test.go) absorbs GetAgent_*
from get_agent_test.go (deleted) and the ListAgents
happy-path tests previously in client_test.go.
client_test.go now holds only the cross-cutting infra tests
(NewClient validation, checkResponse error mapping,
ctx cancellation, header omission). The error-mapping
suite still uses ListAgents as a vehicle since it's
the simplest GET shape; the file header explains why.
pkg/cmd/agents/providers/
providers.go 738 LOC → 93 LOC (parent + readBody/ctxOrBackground/
relTimeOrDash helpers)
list.go, get.go, create.go, update.go, delete.go, models.go
NEW — one file per verb, same package so MaskInput
and the helpers stay un-exported
providers_test.go shrinks to the test harness (newClientForServer +
writeTempJSON); each verb gets a sibling _test.go
with its own scenarios.
AGENTS.md gains a "File organisation" section documenting both layers'
tag-aligned conventions and the explicit decision NOT to promote
sub-group verbs to their own subpackages.
All gates green: gofumpt, go vet, go test ./..., e2e TestAgents/{cache,
try,dry-run,config,providers}, task build, golangci-lint on touched
dirs.
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds 5 endpoints under `agents conversations <verb>`: list, get, delete,
purge, export. Brings CLI parity to 24/41 documented Agent Studio
endpoints (~59%, ~71% of publicly visible).
Client (api/agentstudio/conversations.go):
- ListConversations → typed PaginatedConversationsResponse
- GetConversation → json.RawMessage (full shape has discriminated
message-role union; passthrough is honest)
- DeleteConversation → 204 sentinel-mapped
- PurgeConversations → 204 (DELETE on the list endpoint)
- ExportConversations → json.RawMessage (spec leaves shape unspecified)
ListConversationsParams.FeedbackVote is *int because nil = no filter
while 0 (downvote) is a meaningful value. Regression-tested.
Commands (pkg/cmd/agents/conversations/, per-verb files following the
providers/ pattern from the previous refactor):
- list: table view + --output json passthrough; supports
--start-date/--end-date/--include-feedback/--feedback-vote.
--feedback-vote requires --include-feedback (backend
constraint, rejected at CLI for a clearer error).
- get: raw JSON passthrough (pretty-printed by default,
structured via --output json), --include-feedback toggle.
- delete: single-record; mirrors `agents delete` confirm contract
(--confirm required non-TTY, --dry-run previews).
- purge: bulk; two extra guardrails on top of the confirm contract:
(a) dateless purge requires explicit --all (otherwise a
typo on --start-date would silently turn a range purge
into a wipe-all);
(b) --all is mutually exclusive with --start-date/--end-date.
- export: --output-file writes raw bytes (jq-friendly), stdout
pretty-prints; --start-date/--end-date scope.
All verbs take agent-id as the first positional argument, matching
`agents publish/run/cache invalidate`.
Tests:
- 14 client unit tests (incl. *int FeedbackVote zero-value regression,
feature-disabled propagation, ID validation, dateless-purge wire
shape).
- 16 cmd-level tests (incl. dateless-purge guardrail rejection,
--all + date mutex rejection, feedback-vote-without-include
rejection, dry-run URL+scope shapes).
- e2e/testscripts/agents/conversations.txtar (no registration needed
— TestAgents glob-runs everything in testscripts/agents/).
AGENTS.md gains a "Conversations: purge vs delete" section documenting
both guardrails and why they live at the cmd layer (the wire client
mirrors the spec).
Gates green: gofumpt, go vet, go test ./... agent dirs (16/16 packages),
e2e TestAgents/{cache,try,dry-run,config,providers,conversations},
golangci-lint, task build.
Co-authored-by: Cursor <cursoragent@cursor.com>
Spec-vs-reality mismatch caught by Anya's Phase 7 live vet against
staging EU.
OpenAPI declares both `startDate` and `endDate` on
`DELETE /1/agents/{id}/conversations` as `required: false`, which
reads as "dateless purge wipes everything." The live backend
disagrees: dateless DELETE returns
`400 "At least one filter is required"`.
The previously-shipped `--all` flag was therefore non-functional —
every invocation produced a request the backend always rejected.
Changes:
- Remove `--all` (and the dead `PurgeOptions.All` field).
- Require at least one of `--start-date`/`--end-date` at the CLI;
error message names the backend constraint so future spec drift
stays debuggable.
- Document the open-ended-bound workaround (--start-date 1970-01-01)
in help text for genuine wipe-all needs. Verified live (exit 0).
- Tests realigned: drop the obsolete --all/dateless-without-all
cases, add Test_runPurgeCmd_RefusesDateless and
Test_runPurgeCmd_AcceptsOpenEndedStart.
- e2e/testscripts/agents/conversations.txtar updated to match.
- AGENTS.md gains an explicit "Spec vs reality on purge" callout
with the failing wire trace from the vet.
Lesson noted in tmp/lab_week_3_plan.md and tmp/qa/anya_phase7_vet/
REPORT.md: trust live curl over the FastAPI OpenAPI spec, which
reflects the function signature rather than the validation logic.
Gates green post-fix: gofumpt, go vet, conversations unit tests,
TestAgents/conversations e2e, golangci-lint on agents tree,
task build.
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds 6 endpoints under `agents domains <verb>`: list, get, create,
delete, bulk-insert, bulk-delete. Per-agent CORS allowlist; agent ID
positional first arg as in conversations/cache.
Client (api/agentstudio/domains.go):
- Typed AllowedDomain + AllowedDomainListResponse (no opaque JSON,
no pagination — list returns {domains:[]} flat).
- Bulk insert wants {domains:[]string}, bulk delete wants
{domainIds:[]string} as a request body (DELETE with body).
Commands (pkg/cmd/agents/domains/):
- list, get, create (--domain), delete (-y/--dry-run),
bulk-insert (--domain repeated or -F file), bulk-delete
(--domain-id repeated or -F file). Bulk verbs accept a JSON array
of strings via -F for scripted use; --domain* and -F are mutually
exclusive.
- Live-probed against staging EU before coding; wire shapes match
the spec exactly.
Gates: gofumpt, vet, unit, e2e (TestAgents/domains), lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
… 8b)
5 endpoints under `agents keys <verb>`: list, get, create, update,
delete. Per-app scope, optional per-agent allowlist via agentIds.
Mutating verbs require an admin API key. Live-probed against staging
EU: GET /1/secret-keys returns 200 with empty list, all mutating
verbs return 403 {"message":"Admin API key required."} with a
non-admin key (extractDetail already maps {message} envelopes — gives
ErrForbidden + clear detail). Documented in command long-help.
MaskInput moved from pkg/cmd/agents/providers/ to pkg/cmd/agents/shared/
since two consumers now redact secrets (provider input.apiKey,
secret-key Value). Adds MaskString for top-level scalar secrets.
Update semantics:
- --name and --agent-id are independently set; nil pointer = leave
field unchanged in the PATCH body (matches OpenAPI's anyOf-null).
- Repeated --agent-id replaces (not merges) the allowlist.
- --agent-id="" clears the allowlist (empty strings filtered out;
if all values are empty the resulting [] still gets sent through).
Output safety:
- SecretKey.Value is redacted as "***" by default in list/get/
create/update output.
- --show-secret opts in to raw value (e.g. for piping into a vault).
Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/keys),
lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
Two single-tag sub-groups:
agents feedback create POST /1/feedback
agents user-data get GET /1/user-data/{user_token}
agents user-data delete DELETE /1/user-data/{user_token}
Live-probed against staging EU. Wire shapes match the spec; the
feedback endpoint returns 404 with {message:...} envelope for unknown
message IDs (already mapped via extractDetail).
Notes:
- feedback validates locally: vote ∈ {0,1}, tags ≤10 each ≤50
chars, notes ≤1000 chars (matches OpenAPI bounds; saves a
round-trip on common mistakes).
- user-data is the GDPR right-to-access (get) and right-to-be-
forgotten (delete) endpoints; delete is irreversible and erases
across every agent in the app, so the prompt is explicit and
--confirm is enforced in non-interactive shells.
- GetUserData returns the full envelope with []json.RawMessage
inner items (conversations have a discriminated message-role
union, memories have an unspecified shape — same convention as
GetConversation in Phase 7).
Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/
feedback + TestAgents/userdata), lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
Closes 100% of the Agent Studio OpenAPI surface (41/41 endpoints).
New top-level group (parent hidden, children visible):
agents internal status GET /status
agents internal memorize POST /1/agents/agents/{id}/memorize
agents internal ponder POST /1/agents/agents/{id}/ponder
agents internal consolidate POST /1/agents/agents/{id}/consolidate
New providers verb:
agents providers defaults GET /1/providers/models/defaults
Path quirk verified live: memorize/ponder/consolidate live under the
doubled prefix /1/agents/agents/ (the spec is correct, not a typo).
The single-prefix equivalents 404 with FastAPI's standard {detail:
"Not Found"} envelope; the doubled prefix routes to the actual
handler. Pinned in the godoc on api/agentstudio/internal.go.
Hidden semantics: parent has Hidden:true so it doesn't pollute
`agents --help`, but children are visible — running
`agents internal --help` still surfaces what's available to anyone
who knows to look. Pinned by a unit test.
Memory verbs (memorize/ponder/consolidate) take the agent ID
positional and a JSON body via --body or -F ("-" for stdin),
mirroring the agents try input contract. Body shapes are evolving
(messages is a v4/v5 discriminated role union per the spec) so the
client passes the body through as-is and returns the response as
json.RawMessage — no premature typing.
defaults: small read returning {provider_type → recommended_model}.
Sibling of `agents providers models` — different question, different
shape, didn't fit as a flag on the existing models verb.
Gates: gofumpt, vet, unit (api+all agents pkgs), e2e (TestAgents/*),
lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
…ya F-1)
Anya's Phase 8 vet caught an asymmetric 404 from the staging gateway:
`agents user-data get` and `delete` against any token containing
"/" return `agent studio: 404 Not Found: Not Found` while every
other unknown token (including ones with +, ?, #, &, =) returns a
clean 200 empty / 204.
Root cause is server-side — the gateway decodes %2F before routing
so the path no longer matches /1/user-data/{token}. The CLI does the
right thing with url.PathEscape; the symptom is a misleading 404
that an operator processing a GDPR deletion request would reasonably
read as "data was never stored."
Fix: refuse "/"-bearing tokens at the CLI layer with a precise
error message that names the gateway behavior, on both verbs. One
shared const, two two-line guards. No behavior change for tokens
without "/".
Live-verified: `./algolia agents user-data get "ab/cd"` now
prints the explicit message instead of "404 Not Found."
Gates: gofumpt, vet, unit (api+all agents pkgs), e2e
(TestAgents/userdata), lint, build.
Co-authored-by: Cursor <cursoragent@cursor.com>
Three small follow-ups to PR algolia#212. * `agents try` / `agents run`: TTY-attached stdout now renders a human-readable transcript (text-deltas inline, tool calls/results as dim annotations, errors red) instead of NDJSON. Non-TTY paths keep emitting NDJSON unchanged so existing pipelines (jq, log capture, scripts) are not broken. New `--ndjson` flag forces NDJSON on a TTY for users who want machine output on screen. * `agents domains bulk-delete`: now pre-fetches the existing list (TTY-only, one extra GET) so the success line can split requested IDs into removed-vs-already-absent. Non-TTY behavior unchanged. Backend returns 204 with no body, so this is the only signal we can give users about whether their delete actually did anything. * `agents keys list`: lock the table-render mask path with two new tests (default masks, --show-secret reveals). Existing tests only covered --output json — the table writes k.Value verbatim, so the mask must apply before the row is added. Co-authored-by: Cursor <cursoragent@cursor.com>
…ents Pulls the long-form Agent Studio rationale (file layout, auth, host resolution, streaming protocols, dry-run semantics, secret masking, backend gotchas) out of multi-paragraph in-source comments and out of AGENTS.md, into a single reference at docs/agents.md. Source files now keep one-line godoc comments only; gotchas that genuinely have to live next to the code (purge requires a date, slash in user-tokens, doubled memory-op path, memory polarity, ACL on configuration) are kept as 1-2 line hints with a pointer to docs/agents.md. AGENTS.md drops the agent-studio sections and links to the doc. Net: -945 / +219 lines across 25 files, no behavior change. Docs generator + full test sweep + gofumpt all clean. Co-authored-by: Cursor <cursoragent@cursor.com>
Five low-risk DRY passes across `pkg/cmd/agents/`:
- `newClientForServer` × 14 test files → `sharedtest.NewClient`
(new `pkg/cmd/agents/sharedtest/` package, kept out of production
imports). `writeTempJSON` × 6 → `sharedtest.WriteTempJSON`.
- `ctxOrBackground` × 7 sub-group files → `shared.OrBackground` and
inline `if ctx == nil { ctx = context.Background() }` blocks in
delete/cache/try/run.
- `--confirm` / "non-TTY refuses without -y" pattern × 9 commands
→ `shared.AddConfirmFlag` + `shared.ResolveConfirm` +
`shared.Confirm`.
- File-reading helpers (`readBody`, `readJSONBody`, inline
`cmdutil.ReadFile + TrimUTF8BOM + json.Valid` in
config/update/create/domains) → `shared.ReadJSONFile`.
- Completion-flag block (8 flags + the input/message mutex)
duplicated in `try` and `run` → `shared.CompletionInputs` +
`shared.RegisterCompletionFlags`.
Net: 102 files changed, +587 / -770. Behavior unchanged; full suite
green, golangci-lint clean.
Co-authored-by: Cursor <cursoragent@cursor.com>
Allow common provider flows without -F: --name/--provider plus --api-key, --api-key-stdin, or --api-key-env for create (openai, anthropic, google_genai, deepseek); patches via --name, API key flags, and optional --base-url for update. Azure and openai_compatible still require JSON files. Keeps file and shortcut paths mutually exclusive. Updates agents e2e providers.txtar for the relaxed -F requirement. Co-authored-by: Cursor <cursoragent@cursor.com>
Adds docs/agents.md section on providers -F vs flags; links to docs/qa/arg_friendly_providers_SIGNOFF.md (Lab Week planning-team alignment + ordered test commands). Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
tmp is cleaned by Taskfile and may contain local QA Go packages outside review; use issues.exclude-dirs instead of deprecated run.skip-dirs. Co-authored-by: Cursor <cursoragent@cursor.com>
Reuse build ldflags (including Agent Studio URL from .env); output binary algolia-beta and default VERSION main+beta. Document in AGENTS and README. Co-authored-by: Cursor <cursoragent@cursor.com>
Inject version.Distribution via task build-beta ldflags (empty for releases). PersistentPreRunE + help wrapper on agents group warns on stderr; defaults stay on stdout pipelines. Docs + DIST template in Taskfile. Co-authored-by: Cursor <cursoragent@cursor.com>
Remove docs/qa sign-off sheet; trim agents.md link. Strip tmp/qa, named QA, and Phase N callouts from tests and e2e scripts while keeping technical rationale. Co-authored-by: Cursor <cursoragent@cursor.com>
Strip the experimental --dry-run flag and preview branches from every agents subcommand. ResolveConfirm now only reflects TTY versus --confirm. Remove shared PrintDryRun and fold body helpers down to SourceLabel and TrimUTF8BOM. Refresh docs/agents.md, AGENTS.md, and README wording for beta builds. Rewrite e2e testscripts/agents cases and drop agents dry-run smoke. Align try help/comments and unit tests. Co-authored-by: Cursor <cursoragent@cursor.com>
|
🤖 AI-Generated Code Review by Claude PR #212 Code ReviewOverviewLarge new feature PR (+13,183 lines) adding the complete Overall: Well-structured addition with good test coverage, sensible guardrails (secret masking, bounded purge, path-escaping), and clear documentation. A few security and reliability concerns in the transport layer and URL-construction code warrant attention before shipping. 🚨 HIGH ISSUES1.
|
🤔 What
Adds
algolia agentsbacked byapi/agentstudio: agent lifecycle (list,get,create,update,delete,publish,unpublish,duplicate), completions (runon a persisted id,tryon a local--configfile hitting/1/agents/test/completions), plusconversations,domains,providers,keys,feedback,user-data(aliasuserdata),cache,config, and hiddeninternal. Destructive or sensitive paths require--confirm/-yon non-TTY, same pattern as elsewhere in the CLI.🤷 Why
Agent Studio workflows are cumbersome to reproduce with one-off HTTP calls or dashboard-only flows; exposing them in the Algolia CLI makes automation, CI, and support playbooks repeatable.
🛠 How
api/agentstudiopkg/cmd/agents/*--show-secretto print raw values); guardrails spelled out indocs/agents.md(conversation purge bounds,/in user tokens, memory route prefix).AgentStudioClientoncmdutil.Factory;agent_studio_urlplusALGOLIA_AGENT_STUDIO_URL(seedocs/agents.md/ rootagents --helpfor precedence).e2e/testscripts/agents/*.txtarfor offline validation paths; gated liveagents listwhenALGOLIA_AGENT_STUDIO_E2E=1.Full
algolia agentsleaf commands — CobraUseLine+ command-scoped flags🧪 Testing
Quick checks
task test task lint task buildPackages
E2E (
TestAgents, offline scripts; synthetic app credentials suffice)E2E (add live
agents listenvelope check)Action / expectation
algolia agents conversations purge "<agent-id>" -yruns in a non-interactive shell without--start-dateor--end-date, then the process exits non-zero before a purge request leaves the CLI (bounded delete is mandatory).algolia agents user-data get 'ab/cd'(token contains/), then the CLI exits non-zero with the gateway-routing advisory (avoiding a bogus 404 from premature decoding).Optional staging / manual probes (substitute real ids and JSON paths)
agents keysverbs require an admin key (HTTP 403 is mapped deterministically).ALGOLIA_AGENT_STUDIO_URLis optional—when unset the CLI follows thedocs/agents.mdresolver (profile region → baked default URL → proxy fallback).