Planner read-only network access via a ToolSet capability axis #14
Merged
The planner runs read-only (`--permission-mode plan`) and autonomous (stdin closed, no ask_user MCP). Plan mode gates WebFetch and gh/curl behind a permission prompt the autonomous agent can't answer, so every issue/PR fetch died.

Layer a scoped read-only network allowlist (WebFetch, WebSearch, and non-mutating `gh` subcommands) onto plan mode via `--allowedTools`; plan mode still hard-blocks every edit. Verified against claude 2.1.170 that `--allowedTools` pre-approves tools under plan mode with stdin closed.

Also fix brief.md: the prompt claimed tasks are implemented by "separate coding agents," but every flow mints one session and reuses it across all tasks via implementTaskLoop. Made it singular.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Introduce `enum ToolSet { ReadOnly, NetworkOnly, Full }` and replace the
boolean `LlmConfig.readOnly` with `tools: ToolSet`. The capability axis (which
tools exist) is now distinct from `autoApprove` (which auto-approve), and each
backend's *Args maps the three tiers with a single match instead of
`if readOnly … else autoApprove …`.
Pure refactor relative to master: `ReadOnly` emits the old read-only flags,
`Full` the old write-capable flags, and `NetworkOnly` behaves identically to
`ReadOnly` for now — the network capability is layered on per backend in the
following commits. Builder API gains `withTools` (primitive) plus the
`withReadOnly` / `withNetworkOnly` conveniences.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
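
The shape of that refactor can be sketched roughly as follows, in Scala 3 syntax. This is a simplified illustration, not the code from the commit: the real `LlmConfig` has more fields, and the `Boolean` type for `autoApprove` and both default values are assumptions made here.

```scala
// Capability axis: which tools exist for the agent.
enum ToolSet:
  case ReadOnly, NetworkOnly, Full

// Hypothetical, trimmed-down config. Field types and defaults are assumptions.
final case class LlmConfig(
    tools: ToolSet = ToolSet.Full,
    autoApprove: Boolean = false // prompting axis, independent of `tools`
):
  // Primitive setter; the two conveniences below delegate to it.
  def withTools(t: ToolSet): LlmConfig = copy(tools = t)
  def withReadOnly: LlmConfig = withTools(ToolSet.ReadOnly)
  def withNetworkOnly: LlmConfig = withTools(ToolSet.NetworkOnly)
```

Keeping `withTools` as the single primitive means a later tier only needs a new enum case plus one more arm in each backend's match.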
ClaudeBackend gains a `networkTools` allowlist (default `DefaultNetworkTools`: WebFetch/WebSearch + scoped `gh` reads incl `Bash(gh api:*)`) and `withNetworkTools`, threaded into ClaudeArgs. On `ToolSet.NetworkOnly`, claude now emits `--permission-mode plan --allowedTools <list>`: read-only network layered onto plan mode, which still hard-blocks general bash and every edit. The allowlist lives on the backend (claude-specific strings), exposed via `ClaudeTool.withNetworkTools`, so it stays off the shared `LlmConfig`.

`Plan.autonomousResult` (from / assessThenPlan / triage) now selects `withNetworkOnly`, so autonomous planners can fetch the issue/PR they were pointed at. Reviewers, `reviewed`/`briefed`, reviewer-selection and lint keep plain `withReadOnly`: no network, hard no-edit everywhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
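
A minimal sketch of the three-tier match for claude, under stated assumptions. The `claudeArgs` signature and the `fullArgs` placeholder are hypothetical, the allowlist below holds only the three entries named in this commit message (the real `DefaultNetworkTools` has further scoped `gh` reads), and joining the list with commas is an assumption about how the flag is emitted.

```scala
enum ToolSet:
  case ReadOnly, NetworkOnly, Full

// Only the entries named in the commit message; the real default list is longer.
val DefaultNetworkTools: Seq[String] =
  Seq("WebFetch", "WebSearch", "Bash(gh api:*)")

// Hypothetical signature. `fullArgs` stands in for the unchanged write-capable flags.
def claudeArgs(
    tools: ToolSet,
    networkTools: Seq[String] = DefaultNetworkTools,
    fullArgs: Seq[String] = Seq.empty
): Seq[String] =
  tools match
    case ToolSet.ReadOnly =>
      Seq("--permission-mode", "plan")
    case ToolSet.NetworkOnly =>
      // Plan mode still blocks edits; the allowlist pre-approves network reads.
      // Assumption: the list is passed as one comma-separated argument.
      Seq("--permission-mode", "plan", "--allowedTools", networkTools.mkString(","))
    case ToolSet.Full =>
      fullArgs
```

Because the allowlist is a parameter of the backend-specific function, the claude-specific tool strings never need to appear on a shared config type.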
Neither backend can grant network without a writable surface: pi has no web tool (network = the general `bash` tool), and codex has no read-only-with-network sandbox (network requires `workspace-write`). So on `NetworkOnly`:

- pi: append `bash` to the `--tools` allowlist.
- codex: emit `--full-auto` (non-interactive workspace-write) plus a global `-c sandbox_workspace_write.network_access=true` (placed before the `exec` subcommand, codex's top-level config slot).

On these two backends the no-edit guarantee is therefore prompt-only: the planning prompts forbid edits. ReadOnly and Full tiers are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
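
The two `NetworkOnly` arms described above might look like this. Both function names are hypothetical, `readOnlyTools` and `execArgs` are placeholders for values the commit message does not spell out, and placing `--full-auto` after `exec` is an assumption (the message only fixes the position of the `-c` override).

```scala
// pi: network is only reachable through the general `bash` tool,
// so NetworkOnly appends it to whatever the read-only allowlist was.
def piNetworkTools(readOnlyTools: Seq[String]): Seq[String] =
  readOnlyTools :+ "bash"

// codex: the network override is a global option, so it precedes `exec`.
// Assumption: `--full-auto` follows `exec`; `execArgs` is the rest of the command.
def codexNetworkArgs(execArgs: Seq[String]): Seq[String] =
  Seq("-c", "sandbox_workspace_write.network_access=true", "exec", "--full-auto") ++ execArgs
```

Neither arm can express "no edits" in flags, which is why the guarantee on these backends rests on the prompt.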
Verified against the gemini CLI that `web_fetch` works under `--approval-mode yolo` but is blocked under `plan` in headless runs. So gemini `NetworkOnly` (= plan mode) grants no network beyond `ReadOnly`. opencode likewise gates shell on the read-only tiers (bash disabled), so no writable-shell network; its web tool may remain available but that's server-dependent and unverified.

Correct the earlier "opencode/gemini already allow web in read-only" comments to reflect this: neither gets dedicated network on NetworkOnly (it behaves like ReadOnly), and those flows pre-fetch issue/PR context instead. No behavior change: the code already mapped both tiers to the safe read-only mode. Adds an opencode test pinning that NetworkOnly keeps bash disabled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add explicit "do not edit / run mutating commands" lines to triage.md, review.md, and brief.md. triage.md is load-bearing: triage runs via the autonomous planner path (NetworkOnly), which is prompt-guarded on pi/codex; the other two are ReadOnly (hard) and get the line as defense-in-depth.

Add ADR 0016 capturing the two-axis model (ToolSet capability vs AutoApprove prompting), the per-backend NetworkOnly mapping and guarantees, the verified gemini/opencode no-network finding, and the claude-local allowlist placement.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Tool table: add `withNetworkOnly` (all backends) and `withNetworkTools` (claude) to the method lists.
- Coding-agent-tools section: reframe around the two axes, capability (`tools: ToolSet`, with the ReadOnly/NetworkOnly/Full tiers) vs prompting (`autoApprove`); document `withNetworkOnly`/`withNetworkTools` and the per-backend network guarantee, linking ADR 0016.
- Planning grid: the autonomous planner is now read-only + network, not strictly read-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Correction: my earlier "gemini gets no network" conclusion tested plan mode without `--allowed-tools`. Re-verified on the gemini CLI that plain plan mode blocks `web_fetch`, but `--approval-mode plan --allowed-tools web_fetch` runs it and returns content. So gemini can keep its hard no-edit guarantee AND get web reads, analogous to claude's allowlist.

Map gemini NetworkOnly → `plan` + `--allowed-tools web_fetch` (no shell `gh`, so no authed GitHub, but web works). `--allowed-tools` is deprecated (gemini 1.0 → Policy Engine); migrate then. Update the ToolSet doc, ADR 0016, and README accordingly (gemini moves from the no-network bucket to hard + web).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
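
The corrected gemini mapping can be sketched as below. The `geminiArgs` name and signature are hypothetical, and `fullArgs` is a placeholder for the unchanged write-capable flags, which this commit message does not list.

```scala
enum ToolSet:
  case ReadOnly, NetworkOnly, Full

// Hypothetical signature; only the ReadOnly and NetworkOnly flags come from the text above.
def geminiArgs(tools: ToolSet, fullArgs: Seq[String] = Seq.empty): Seq[String] =
  tools match
    case ToolSet.ReadOnly =>
      Seq("--approval-mode", "plan")
    case ToolSet.NetworkOnly =>
      // Plan mode keeps the hard no-edit gate; the allowlist unblocks web reads only.
      Seq("--approval-mode", "plan", "--allowed-tools", "web_fetch")
    case ToolSet.Full =>
      fullArgs
```

Since `--allowed-tools` is flagged above as deprecated, keeping it confined to a single match arm should keep the later Policy Engine migration local to this function.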
Replaces the boolean LlmConfig.readOnly with enum ToolSet { ReadOnly, NetworkOnly, Full }: the capability axis, distinct from AutoApprove (prompting). Only the autonomous planner path (from/assessThenPlan/triage) selects NetworkOnly; reviewers, reviewed/briefed, selection and lint stay ReadOnly (the hard no-edit gate Reviewers.scala relies on).

Per-backend network and guarantees are in ADR 0016:

- claude keeps a hard no-edit guarantee (command-scoped --allowedTools, configurable via claude.withNetworkTools).
- pi/codex are prompt-only (network needs a writable surface).
- gemini keeps its hard no-edit guarantee and gets web reads via plan mode plus --allowed-tools web_fetch (no shell gh, so no authenticated GitHub).
- opencode gets no dedicated network (NetworkOnly behaves like ReadOnly) and pre-fetches issue/PR context instead.