Draft
26 commits
06815ad
docs(reviews): design /release-review periodic full-codebase review
JohnMcLear May 9, 2026
1f10322
docs(reviews): plan /release-review implementation
JohnMcLear May 9, 2026
df99fc3
chore(reviews): scaffold /release-review directory structure
JohnMcLear May 10, 2026
9ebf9da
feat(reviews): add shared types for /release-review helpers
JohnMcLear May 10, 2026
ac0eb1f
feat(reviews): add fingerprint helper with whitespace-stable hashing
JohnMcLear May 10, 2026
b9c3405
chore(deps): add js-yaml as direct dep for release review suppression…
JohnMcLear May 10, 2026
597289f
feat(reviews): add known-findings.yml load/append with validation
JohnMcLear May 10, 2026
cd503c0
feat(reviews): add finding aggregation with dedupe and severity floor
JohnMcLear May 10, 2026
d095722
feat(reviews): add heuristic auto-triage classifier
JohnMcLear May 10, 2026
5f7b118
feat(reviews): add run-id generation and run-dir helpers
JohnMcLear May 10, 2026
53d330d
feat(reviews): add session summary writer
JohnMcLear May 10, 2026
364941d
feat(reviews): add CLI entry point for /release-review helpers
JohnMcLear May 10, 2026
b883273
feat(reviews): add Phase 1 tools subagent prompt
JohnMcLear May 10, 2026
4aa960b
feat(reviews): add Phase 2 auth-sessions subagent prompt
JohnMcLear May 10, 2026
8c6582c
feat(reviews): add Phase 2 realtime-api subagent prompt
JohnMcLear May 10, 2026
ef757ce
feat(reviews): add Phase 2 pad-changeset subagent prompt
JohnMcLear May 10, 2026
adc7403
feat(reviews): add Phase 2 db-supply subagent prompt
JohnMcLear May 10, 2026
fd0b814
fix(reviews): anchor realtime-api scope paths to {{repo_root}}
JohnMcLear May 10, 2026
1d87dfd
fix(reviews): anchor pad-changeset scope paths to {{repo_root}}
JohnMcLear May 10, 2026
d49de43
fix(reviews): anchor db-supply scope paths to {{repo_root}}
JohnMcLear May 10, 2026
83348bd
feat(reviews): add /release-review slash command orchestrator
JohnMcLear May 10, 2026
f04fc4b
fix(reviews): inline run-id in slash command (Bash tool has no shell-…
JohnMcLear May 10, 2026
9ec1291
docs(reviews): operator guide for /release-review
JohnMcLear May 10, 2026
f9f24b8
fix(reviews): drop redundant src/ prefix from cli.ts path (pnpm exec …
JohnMcLear May 10, 2026
8d9d8d9
fix(reviews): aggregate accepts repoRoot to resolve repo-relative paths
JohnMcLear May 10, 2026
a83054c
fix(reviews): include fingerprint/file/ruleId in all decision appends
JohnMcLear May 10, 2026
161 changes: 161 additions & 0 deletions .claude/commands/release-review.md
@@ -0,0 +1,161 @@
---
description: Run a full-codebase Medium+ review session for a release.
argument-hint: [--resume <run-id>]
---

# /release-review

Periodic full-codebase review. Three phases: deterministic tools, parallel AI subsystem sweeps, interactive Medium+ walkthrough.

**Spec:** `docs/superpowers/specs/2026-05-09-release-review-design.md`
**Plan:** `docs/superpowers/plans/2026-05-09-release-review.md`

## Argument parsing

If the user passed `--resume <run-id>`, set `RESUME=1` and `RUN_ID=<run-id>`. Skip Phase 1 and Phase 2 if `RESUME=1`. Otherwise allocate a fresh run-id (Phase 0 below).
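The parsing rule above can be sketched as a small function. This is only an illustration: `parseReviewArgs` and `ReviewArgs` are hypothetical names, not part of the actual `cli.ts`.

```typescript
// Hypothetical helper illustrating the --resume rule; not the real implementation.
interface ReviewArgs {
  resume: boolean;
  runId?: string;
}

function parseReviewArgs(argv: string[]): ReviewArgs {
  const i = argv.indexOf("--resume");
  // No flag: fresh run, Phase 0 allocates the run-id.
  if (i === -1) return { resume: false };
  const runId = argv[i + 1];
  // The flag is only meaningful with a run-id after it.
  if (runId === undefined || runId.startsWith("--")) {
    throw new Error("--resume requires a <run-id>");
  }
  return { resume: true, runId };
}
```

When `resume` is true, Phase 1 and Phase 2 are skipped and `runId` is used as given.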

## Phase 0 — Setup

Run:
```bash
pnpm --filter ep_etherpad-lite exec tsx node/utils/releaseReview/cli.ts next-run-id /tmp/release-review
```

This prints the next run-id (e.g. `run-2026-05-09-1`). **Remember the run-id in your conversation state** — Bash tool calls do not persist shell variables across invocations, so every later step that needs the run-id or run-dir path must inline the literal values.

Create the run-dir:

```bash
mkdir -p /tmp/release-review/<run-id>
```

Replace `<run-id>` with the actual id from above. State the run-id to the user before continuing.
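For orientation, here is a minimal sketch of how a run-id such as `run-2026-05-09-1` could be allocated. It assumes the trailing number counts runs made on the same date; the real `next-run-id` command may work differently, and `nextRunId` is a hypothetical name.

```typescript
// Assumption: ids look like run-YYYY-MM-DD-N, where N counts runs on that date.
function nextRunId(existingDirs: string[], today: string): string {
  const prefix = `run-${today}-`;
  let max = 0;
  for (const name of existingDirs) {
    if (!name.startsWith(prefix)) continue;
    const n = Number(name.slice(prefix.length));
    // Ignore directory names whose suffix is not a positive integer.
    if (Number.isInteger(n) && n > max) max = n;
  }
  return `${prefix}${max + 1}`;
}
```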

## Phase 1 — Tool sweep (skip if --resume)

Read `docs/reviews/prompts/tools.md`. Substitute `{{run_id}}` and `{{repo_root}}` with the live values. Dispatch a single general-purpose Agent with the substituted prompt.

Block until the subagent completes. Verify `/tmp/release-review/<run-id>/tool-findings.json` exists (replacing `<run-id>` with the literal run-id from Phase 0). If it doesn't, surface the failure and ask the user whether to continue with Phase 2 only.

## Phase 2 — AI subsystem sweep (skip if --resume)

Read all four prompt files:
- `docs/reviews/prompts/auth-sessions.md`
- `docs/reviews/prompts/realtime-api.md`
- `docs/reviews/prompts/pad-changeset.md`
- `docs/reviews/prompts/db-supply.md`

For each prompt, substitute `{{run_id}}` (with the run-id from Phase 0) and `{{repo_root}}` (with the absolute path of the current repo, available via `git rev-parse --show-toplevel`). Then dispatch four general-purpose Agents IN PARALLEL — single message, four Agent tool calls, one per substituted prompt.

Block until all four complete. Verify each output JSON exists. For any subagent that did not run or failed, record `"missing: <name>"` in the merged report so the user knows coverage was partial.
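The placeholder substitution described above amounts to a simple template replace. The sketch below is illustrative only; `substitutePrompt` is a hypothetical name, and failing on unknown placeholders is an assumption rather than documented behaviour.

```typescript
// Hypothetical helper: replaces {{name}} placeholders with live values.
function substitutePrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : undefined;
    // Assumption: an unknown placeholder is an error rather than left in place.
    if (value === undefined) throw new Error(`no value for placeholder ${match}`);
    return value;
  });
}
```

Failing loudly here keeps a half-substituted prompt from reaching a subagent.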

## Phase 3 — Aggregate, suppress, triage, walk

### 3a. Aggregate

```bash
pnpm --filter ep_etherpad-lite exec tsx node/utils/releaseReview/cli.ts \
aggregate /tmp/release-review/<run-id> docs/reviews/known-findings.yml medium "$(git rev-parse --show-toplevel)"
```
(Replace `<run-id>` with the literal run-id from Phase 0.)
Reads all `*.json` from the run-dir except `merged.json` / `triage.json`. Writes `merged.json`.
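As a rough mental model of the aggregate step (dedupe, severity floor, suppression), consider the sketch below. The types and the `aggregate` function here are simplified assumptions for illustration; the real helper also computes fingerprints and resolves repo-relative paths.

```typescript
// Simplified, hypothetical model of the aggregate step.
type Severity = "low" | "medium" | "high";

interface FingerprintedFinding {
  fingerprint: string;
  severity: Severity;
  message: string;
}

const SEVERITY_RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

function aggregate(
  findings: FingerprintedFinding[],
  suppressed: Set<string>,
  floor: Severity,
): FingerprintedFinding[] {
  const seen = new Set<string>();
  const merged: FingerprintedFinding[] = [];
  for (const f of findings) {
    if (SEVERITY_RANK[f.severity] < SEVERITY_RANK[floor]) continue; // below the floor
    if (suppressed.has(f.fingerprint)) continue; // already in known-findings.yml
    if (seen.has(f.fingerprint)) continue; // duplicate across sources
    seen.add(f.fingerprint);
    merged.push(f);
  }
  return merged;
}
```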

### 3b. Triage

```bash
pnpm --filter ep_etherpad-lite exec tsx node/utils/releaseReview/cli.ts \
triage /tmp/release-review/<run-id>
```
(Replace `<run-id>` with the literal run-id from Phase 0.)
Writes `triage.json` with `{fixNow, issue, suppress}` buckets.

### 3c. First-run check

If `docs/reviews/known-findings.yml` has `findings: []` (empty list) AND `merged.json` has 20+ findings, this is a first run. Tell the user:

> First /release-review with no baseline. Found N Medium+ findings. Mark all as accepted-risk baseline (rationale: "baseline at <run-id>; not yet triaged") and only show new findings in future runs? [Y/n]

If Y: bulk-append all `merged.json` fingerprints to known-findings.yml via the `append-suppression` CLI command (one call per entry), exit cleanly.

If N: proceed to walkthrough below.
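The first-run condition is a conjunction of two checks, which can be written as follows. `isFirstRun` is a hypothetical name used only for illustration.

```typescript
// Hypothetical predicate for the first-run check described above.
function isFirstRun(knownFindingsCount: number, mergedCount: number, threshold = 20): boolean {
  // Baseline is empty AND the run produced at least `threshold` findings.
  return knownFindingsCount === 0 && mergedCount >= threshold;
}
```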

### 3d. Walkthrough

Read `triage.json`. Print the summary header:
```
Auto-triage of N Medium+ findings:
Fix now: X (will show patches)
Issue: Y (will draft GH issues)
Suppress: Z (will propose suppression entries)

Walking high-severity Fix-now first.
```

Track decisions in an array. For EACH finding, in this order:

1. **fixNow bucket, sorted by severity desc, then category rank**:
   - Print: `[N/total] severity / category / file:line / ruleId / message`
   - Read 5-line excerpt around the finding's line.
   - Generate a patch using your understanding of the issue + remediationHint.
   - Show the patch as a unified diff.
   - Ask: "Apply? [Y/n/edit/skip]"
   - On Y: apply via Edit. Append `{fingerprint, action: 'fix', file, ruleId}` to decisions.
   - On n / skip: append `{fingerprint, action: 'skip', file, ruleId}`.
   - On edit: ask for guidance, regenerate patch, re-prompt.

2. **issue bucket**:
   - Print finding. Draft a GitHub issue body (Title: `<rule>: <one-liner>`. Body: severity, file:line, message, remediation, links).
   - Ask: "Create issue / edit body / skip?"
   - On create: run `gh issue create --title "..." --body-file -` (with confirmation). Capture issue URL. Append `{fingerprint, action: 'issue', file, ruleId, issueUrl}`.
   - On skip: append `{fingerprint, action: 'skip', file, ruleId}`.
   - If `gh` is missing: print the body, append `{fingerprint, action: 'skip', file, ruleId, rationale: 'gh not available'}`.

3. **suppress bucket**:
   - Print finding + propose `status: accepted-risk` (default) and a one-line rationale based on the message.
   - Ask: "Accept / edit rationale / fix-instead / skip?"
   - On accept: invoke `cli.ts append-suppression` with the entry. Append `{fingerprint, action: 'accepted-risk', file, ruleId, rationale}`.
   - On skip: append `{fingerprint, action: 'skip', file, ruleId}`.
   - On fix-instead: jump to the fixNow flow for this finding.
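The fixNow ordering (severity descending, then category rank) can be sketched as a two-key sort. The category ranking used here, `cve` before `bug`, is an assumption; the actual rank table lives in the helper modules and may differ.

```typescript
// Hypothetical ordering sketch for the fixNow bucket.
interface BucketItem {
  severity: "high" | "medium";
  category: "cve" | "bug";
  file: string;
}

const SEVERITY_ORDER = { high: 0, medium: 1 };
// Assumption: CVE-class findings are walked before plain bugs.
const CATEGORY_ORDER = { cve: 0, bug: 1 };

function sortForWalkthrough(items: BucketItem[]): BucketItem[] {
  // Copy first so the caller's array is not mutated.
  return [...items].sort(
    (a, b) =>
      SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity] ||
      CATEGORY_ORDER[a.category] - CATEGORY_ORDER[b.category],
  );
}
```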

### 3e. End-of-session summary

Determine the version: read `package.json`'s `version` field; if it ends in a release-track suffix, use as-is. Otherwise ask the user for the upcoming version (e.g. "2.8.0").

Write a `SummaryInput` JSON to `/tmp/release-review/<run-id>/summary-input.json` (replacing `<run-id>` with the literal run-id from Phase 0):
```json
{
  "runId": "<run-id>",
  "version": "<resolved>",
  "counts": { "high": <count>, "medium": <count> },
  "decisions": [ ...recorded above... ]
}
```

Then run:
```bash
pnpm --filter ep_etherpad-lite exec tsx node/utils/releaseReview/cli.ts \
summary /tmp/release-review/<run-id>/summary-input.json \
"docs/reviews/<version>-summary.md"
```
(Replace `<run-id>` with the literal run-id from Phase 0.)
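One way to fill the `counts` field of the summary input is to tally the merged findings by severity. Whether the counts are meant to come from `merged.json` or from the recorded decisions is an assumption here, and `countBySeverity` is a hypothetical name.

```typescript
// Hypothetical tally for the "counts" field of summary-input.json.
function countBySeverity(
  findings: { severity: "high" | "medium" }[],
): { high: number; medium: number } {
  const counts = { high: 0, medium: 0 };
  for (const f of findings) counts[f.severity] += 1;
  return counts;
}
```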

Print final instructions:
> Session complete. Suggested next step:
> git add docs/reviews/known-findings.yml docs/reviews/<version>-summary.md
> git commit -m "chore(reviews): triage <version> findings"
> Source edits applied during the session are unstaged; review with `git diff` and commit separately.

## Resume mode

When `--resume <run-id>` is passed:
- Skip Phase 1 + Phase 2.
- Verify `/tmp/release-review/<run-id>/` exists (replacing `<run-id>` with the supplied id); if not, fail with a clear message.
- Skip 3a/3b if `merged.json` and `triage.json` already exist; otherwise re-run them.
- Walk from where the user left off. For now, the walkthrough always restarts from the top of the buckets. The user should keep track of which findings they have already handled; `git diff` also shows which fixes were already applied. On a re-aggregate, fixed code no longer matches the old fingerprint, so those findings drop out naturally.
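The resume rules above reduce to a small decision function. This is a sketch under the stated rules only; `planResume` is a hypothetical name, and passing `null` to signal a missing run-dir is an illustrative choice.

```typescript
// Hypothetical decision helper for --resume.
// `runDirFiles` is the listing of the run-dir, or null if the directory does not exist.
function planResume(runDirFiles: string[] | null): { rerunAggregateAndTriage: boolean } {
  if (runDirFiles === null) {
    throw new Error("run-dir not found; cannot resume");
  }
  const have = new Set(runDirFiles);
  // 3a/3b are skipped only when both outputs are already present.
  const complete = have.has("merged.json") && have.has("triage.json");
  return { rerunAggregateAndTriage: !complete };
}
```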

## Failure handling

- Phase 1 output missing: warn and ask the user whether to continue with Phase 2 only.
- Phase 2 subagent missing: warn (`"missing: <name>"`), continue.
- Malformed `known-findings.yml`: abort the session with the parser's error message and instructions to fix manually.
- Disk write fails to `/tmp/release-review/...`: surface the error and exit; the user's environment likely has /tmp constraints.
89 changes: 89 additions & 0 deletions docs/reviews/README.md
@@ -0,0 +1,89 @@
# /release-review — operator guide

Periodic full-codebase Medium+ review for Etherpad releases.

- **Design**: `docs/superpowers/specs/2026-05-09-release-review-design.md`
- **Implementation plan**: `docs/superpowers/plans/2026-05-09-release-review.md`
- **Slash command**: `.claude/commands/release-review.md`
- **Helper modules**: `src/node/utils/releaseReview/`
- **Tests**: `src/tests/backend/specs/releaseReview-utils.ts`

## When to run

Once per major release version, e.g. before cutting `v2.8.0`. The intent is **periodic** rather than per-PR; CodeQL + dependency-review handle PR-time security signal.

## How to run

In Claude Code, on the `develop` branch:

```
/release-review
```

This runs the full three-phase review: tools → 4 parallel AI subsystem sweeps → live Medium+ walkthrough.

To resume a partially-completed session:

```
/release-review --resume run-2026-05-09-1
```

The run-id is printed at the start of every session.

## What it produces

- **In-session**: live walkthrough where you triage each Medium+ finding (fix / file issue / suppress).
- **Committed at end**:
  - `docs/reviews/known-findings.yml`: appended with new suppression entries
  - `docs/reviews/<version>-summary.md`: session summary
- **Source edits**: applied during Fix-now batch; review with `git diff` and commit separately.

## Suppression file — `docs/reviews/known-findings.yml`

Tracks findings already triaged. New entries are added by `/release-review` automatically. Re-triage by hand-editing:

- **Remove an entry** to make the finding resurface in the next run.
- **Change `status`** to reclassify.
- **Never hand-edit `fingerprint`** — it must come from a real run.
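To see why a hand-edited fingerprint cannot match, it helps to picture how a whitespace-stable hash might be built. The sketch below is an assumption about the general shape only: the inputs (`ruleId`, `file`, a code snippet) and the normalization are hypothetical, and the real fingerprint helper may hash different fields.

```typescript
import { createHash } from "node:crypto";

// Hypothetical fingerprint: sha256 over rule, file and whitespace-normalized code.
function fingerprint(ruleId: string, file: string, snippet: string): string {
  // Collapse runs of whitespace so reformatting alone does not change the hash.
  const normalized = snippet.trim().replace(/\s+/g, " ");
  return createHash("sha256")
    .update([ruleId, file, normalized].join("\0"))
    .digest("hex");
}
```

Under this model, re-indenting a line keeps the fingerprint stable, while an actual code change produces a new one and the finding resurfaces.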

Schema:

```yaml
findings:
  - fingerprint: <sha256>
    status: wontfix | accepted-risk | deferred
    ruleId: <tool or AI rule id>          # optional
    file: <repo-relative path>            # optional
    line: <int>                           # optional
    decidedAt: <YYYY-MM-DD>
    decidedInRun: <run-id>
    rationale: <free text>
    targetRelease: <version>              # required iff status == deferred
```
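The schema's constraints can be expressed as a small validator. This is an illustrative sketch built from the schema above; `KnownFinding` and `validateEntry` are hypothetical names and the real loader's checks and error messages may differ.

```typescript
// Hypothetical validator mirroring the schema above.
type Status = "wontfix" | "accepted-risk" | "deferred";

interface KnownFinding {
  fingerprint: string;
  status: Status;
  ruleId?: string;
  file?: string;
  line?: number;
  decidedAt: string;
  decidedInRun: string;
  rationale: string;
  targetRelease?: string;
}

function validateEntry(e: KnownFinding): string[] {
  const errors: string[] = [];
  if (!/^\d{4}-\d{2}-\d{2}$/.test(e.decidedAt)) {
    errors.push("decidedAt must be YYYY-MM-DD");
  }
  // "required iff": present exactly when status is deferred.
  const hasTarget = e.targetRelease !== undefined;
  if (e.status === "deferred" && !hasTarget) {
    errors.push("targetRelease is required when status is deferred");
  }
  if (e.status !== "deferred" && hasTarget) {
    errors.push("targetRelease is only allowed when status is deferred");
  }
  return errors;
}
```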

## First run

Your first `/release-review` will surface every accumulated issue at once — likely 30–50 Medium+ findings. The session offers baseline-acceptance: bulk-mark everything as `accepted-risk` and only show new findings going forward. Re-triage at your leisure by editing the suppression file.

## Smoke test

Before each release, after running the slash command, verify:

1. The run-dir exists: `ls /tmp/release-review/<run-id>/`
2. `merged.json` is present and valid JSON.
3. `triage.json` is present.
4. The summary file is written: `cat docs/reviews/<version>-summary.md`

If anything looks off, re-run with `--resume <run-id>` and inspect `merged.json` directly.

## Updating the prompts

Each Phase 2 prompt (`docs/reviews/prompts/*.md`) is intentionally a separate, diffable file. When a release surfaces a class of finding the prompts missed, edit the relevant prompt and commit alongside the new fix. Prompts are expected to evolve.

## Adding a new subsystem

To add a fifth subsystem subagent:

1. Add a prompt file at `docs/reviews/prompts/<name>.md` following the same structure.
2. Add it to the parallel-dispatch list in `.claude/commands/release-review.md`.
3. Update `bin/release-review` smoke tests if you cover the new subsystem there.
12 changes: 12 additions & 0 deletions docs/reviews/known-findings.yml
@@ -0,0 +1,12 @@
# /release-review suppression file.
# Entries are appended automatically by `/release-review` when the user
# triages a finding as wontfix / accepted-risk / deferred.
#
# Manual edits are supported for re-triaging:
# - Remove an entry to make the finding resurface in the next run.
# - Change `status` to reclassify.
# - DO NOT hand-edit `fingerprint` — it must come from a real run.
#
# See docs/reviews/README.md for the schema and triage workflow.

findings: []
Empty file added docs/reviews/prompts/.gitkeep
Empty file.
89 changes: 89 additions & 0 deletions docs/reviews/prompts/auth-sessions.md
@@ -0,0 +1,89 @@
# Phase 2 — auth-sessions subagent

You are auditing Etherpad's authentication and session-management subsystem for Medium+ severity bugs and security issues. You will read a fixed set of files, identify problems, and emit a single JSON findings file. **Do not edit source.**

## Mission

Find Medium+ severity bugs, CVE-relevant patterns, and security hardening gaps in the assigned subsystem. **Do not report style, lint, or informational findings.** Those are handled by Phase 1.

## Scope (read only these globs, all relative to `{{repo_root}}`)

- `src/node/db/Session*.ts`
- `src/node/db/AuthorManager.ts`
- `src/node/db/SecurityManager.ts`
- `src/node/handler/PadAuth*.ts`
- `src/node/handler/ExpressAuth*.ts`
- `src/node/hooks/express/*auth*`
- `src/node/security/**`
- `src/node/utils/{authorTokenCookie,ensureAuthorTokenCookie,SecretRotator,crypto}.ts`

If a path doesn't match any files, skip silently.

## What to look for

Prioritize:

1. **Token leakage**: tokens or secrets logged, returned to clients, or stored in URLs / headers that get cached.
2. **Session fixation / hijacking**: session id reuse on privilege change, missing rotation on login.
3. **Missing CSRF protection**: state-changing endpoints without same-origin or token check.
4. **Timing-attack-prone comparisons**: `==` / `===` on secrets/tokens instead of `crypto.timingSafeEqual`.
5. **Auth bypass via plugin hooks**: hook contracts that allow plugins to short-circuit auth checks; missing return-value validation.
6. **OIDC / SSO claim handling**: trusting claims without verifying issuer/audience/expiry; treating missing fields as defaults.
7. **Cookie misconfig**: missing `HttpOnly`, `Secure`, `SameSite` on auth-bearing cookies.
8. **Logic bugs in author-erasure / GDPR endpoints**: data leaks across authors, missing tenant boundaries.

## Severity rubric (apply consistently)

- **High**: exploitable now (auth bypass, token leakage to wrong user, RCE-shaped finding). CVE-equivalent.
- **Medium**: bug under realistic conditions (race exposes session, timing attack practical with realistic latency, hardening gap that compounds).
- **Low / info**: DO NOT REPORT.

## Output

Write to `/tmp/release-review/{{run_id}}/auth-sessions.json`:

```json
{
  "findings": [<Finding>, ...],
  "scope_summary": "<one-line: how many files scanned, lines reviewed>"
}
```

Each `Finding` (no fingerprint — added by aggregate stage):

```json
{
  "source": "auth-sessions",
  "severity": "high|medium",
  "category": "cve|bug",
  "file": "<repo-relative path>",
  "line": <1-indexed integer>,
  "ruleId": "auth-sessions.<short-slug>",
  "message": "<one-line description>",
  "remediationHint": "<short hint if you have one, otherwise omit>"
}
```
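The `Finding` shape above can be checked mechanically before the file is written. The type guard below is a sketch derived from the schema shown here; `Finding` and `isFinding` as written are illustrative, and the shared types in the helper modules may differ.

```typescript
// Hypothetical type guard derived from the Finding schema above.
interface Finding {
  source: string;
  severity: "high" | "medium";
  category: "cve" | "bug";
  file: string;
  line: number;
  ruleId: string;
  message: string;
  remediationHint?: string;
}

function isFinding(x: unknown): x is Finding {
  if (typeof x !== "object" || x === null) return false;
  const { source, severity, category, file, line, ruleId, message, remediationHint } =
    x as Record<string, unknown>;
  return (
    typeof source === "string" &&
    (severity === "high" || severity === "medium") &&
    (category === "cve" || category === "bug") &&
    typeof file === "string" &&
    typeof line === "number" && Number.isInteger(line) && line >= 1 &&
    // ruleId is namespaced by the source, e.g. "auth-sessions.<short-slug>".
    typeof ruleId === "string" && ruleId.startsWith(`${source}.`) &&
    typeof message === "string" &&
    (remediationHint === undefined || typeof remediationHint === "string")
  );
}
```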

## Worked example

A finding might look like:

```json
{
  "source": "auth-sessions",
  "severity": "medium",
  "category": "bug",
  "file": "src/node/db/SecurityManager.ts",
  "line": 142,
  "ruleId": "auth-sessions.token-equality-non-constant-time",
  "message": "Author token compared with === instead of crypto.timingSafeEqual; allows timing oracle on shared infrastructure.",
  "remediationHint": "Use crypto.timingSafeEqual on Buffer.from(a) and Buffer.from(b) of equal length; reject early on length mismatch."
}
```
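For reference, the remediation named in this example could look roughly like the following. `tokensEqual` is a hypothetical helper name, not code from Etherpad; `crypto.timingSafeEqual` and `Buffer.from` are the standard Node.js APIs.

```typescript
import { Buffer } from "node:buffer";
import { timingSafeEqual } from "node:crypto";

// Hypothetical helper showing the constant-time comparison pattern.
function tokensEqual(a: string, b: string): boolean {
  const bufA = Buffer.from(a, "utf8");
  const bufB = Buffer.from(b, "utf8");
  // timingSafeEqual throws when lengths differ, so reject early.
  if (bufA.length !== bufB.length) return false;
  return timingSafeEqual(bufA, bufB);
}
```

The early length check reveals only whether the lengths match, which is usually acceptable for fixed-length tokens.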

## Constraints

- Do NOT respond in chat; emit only the JSON file, plus the single status line described below.
- Cap output at 30 findings. If you find more, prioritize highest-severity / most-confident.
- Be conservative: if you're not sure something is exploitable or wrong, leave it out. False positives waste reviewer time.
- When done, output a single line to stdout: `auth-sessions.json: N findings`.