flow do --auto: harden run-status writes + harness-generic env strip (follow-ups from #67 review)

Follow-ups from an independent review of #67 (`flow do --auto`). The feature shipped (admin-merged after @rr0hit's pre-rebase approval); these are non-blocking hardening items. None are user-visible regressions — the run-status tracking is self-healing via read-time pid reconciliation — but a few writes aren't strictly race-safe and the env strip is claude-specific. Recording them in one issue so they are not lost — unassigned for now (no owner; pick up when someone has bandwidth).

### 1. [high] Launch/finalize race can briefly mislabel a finished run
`flow do --auto` writes `auto_run_status='running'` (in `recordAutoRunLaunched`) *after* `launchAutoRun` starts the detached supervisor, which runs concurrently. If the harness exits almost instantly (e.g. `claude` not on PATH → immediate fail), the supervisor's `finalizeAutoRun('dead')` can land *before* the parent's record, which then clobbers the terminal row back to `running` with an already-dead pid. Reconciliation flips it back to `dead` on the next `show`/`list`, so it's self-healing, but there's a wrong-status window. (A `completed`→`dead` downgrade would require the run to *succeed* in microseconds, so that variant is theoretical.)
**Fix:** make finalize authoritative — guard `recordAutoRunLaunched` (and/or finalize) with a conditional `WHERE` on status/`auto_run_finished` so a finished run can't be resurrected to `running`. Keep `--force` relaunch working (clear the prior terminal state as part of the claim).

### 2. [medium] Completed run mislabeled `dead` on a transient DB error
In `cmdAutoExec`, after `autoRunner` returns nil the code re-reads the task to check `status=='done'`. If that `GetTask` errors (transient `SQLITE_BUSY`), it falls through to `dead` even though the run completed.
**Fix:** retry the read (DB already has a 30s busy_timeout) or treat read-failure as indeterminate rather than asserting `dead`.

### 3. [medium] TOCTOU on the "already running" guard
The auto in-flight guard (`reconcileAutoRun` + status check) runs before the status-flip tx with no lock, so two concurrent `flow do --auto <same-task>` can both pass it and spawn duplicate supervisors (last-writer-wins on pid; both write the same session jsonl). Low likelihood — launchd schedules distinct run-slugs — but the guard is advertised as a safety check and isn't truly race-safe.
**Fix:** fold the in-flight re-check into the status-flip transaction, or claim atomically in `recordAutoRunLaunched` (0 rows affected → lost the race → kill the just-spawned supervisor).

### 4. [low] `autoChildEnv` strips only claude's session-id env var
It drops `CLAUDE_CODE_SESSION_ID`, but `AutoRunArgv` is harness-generic and the supervisor resolves whatever harness the task is pinned to. A future codex/gemini adapter's session-id env (`CODEX_THREAD_ID`, `GEMINI_SESSION_ID`) would leak through, reintroducing the reverse-lookup confusion this strip prevents.
**Fix:** strip every `h.SessionIDEnvVar()` across `allHarnesses()`, not just claude's.

### 5. [low] `--force` auto-relaunch orphans a still-alive supervisor silently
When a prior run is genuinely still running and `--force` is passed, `recordAutoRunLaunched` overwrites the pid; the old supervisor keeps running and writing the same transcript, untracked. Intentional (mirrors interactive `--force`), but unlike the interactive path it emits no warning.
**Fix:** warn when `--force` overrides a still-alive auto run (name the orphaned old pid).

### 6. [low] pid-reuse window in reconciliation
A stale `running` row's pid could be reassigned after a reboot/long uptime, so reconciliation keeps it `running` (or matches a wrong pid). Inherent to pid-based liveness.
**Fix (optional):** also confirm the pid's argv mentions `__auto-exec <slug>` before treating it as alive (the harness already scans `ps`).

### Test gaps
- launch/finalize race (#1)
- `cmdAutoExec` GetTask-error downgrade (#2)
- concurrent `flow do --auto` TOCTOU (#3)
- `autoChildEnv` stripping assertion (#4)
- the real `autoLauncher` detachment path (`Setsid` / fd inheritance / `Process.Release` are fully stubbed today)

---
Source: independent post-rebase review of #67. Cleared as correct in the same review: the migration/column-ordering, `AutoRunArgv` (session pin + inject + skip-perms), `--with-file` threading, and the launch-failure rollback.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

flow do --auto: harden run-status writes + harness-generic env strip (follow-ups from #67 review) #68

1. [high] Launch/finalize race can briefly mislabel a finished run

2. [medium] Completed run mislabeled `dead` on a transient DB error

3. [medium] TOCTOU on the "already running" guard

4. [low] `autoChildEnv` strips only claude's session-id env var

5. [low] `--force` auto-relaunch orphans a still-alive supervisor silently

6. [low] pid-reuse window in reconciliation

Test gaps

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

flow do --auto: harden run-status writes + harness-generic env strip (follow-ups from #67 review) #68

Description

1. [high] Launch/finalize race can briefly mislabel a finished run

2. [medium] Completed run mislabeled dead on a transient DB error

3. [medium] TOCTOU on the "already running" guard

4. [low] autoChildEnv strips only claude's session-id env var

5. [low] --force auto-relaunch orphans a still-alive supervisor silently

6. [low] pid-reuse window in reconciliation

Test gaps

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

2. [medium] Completed run mislabeled `dead` on a transient DB error

4. [low] `autoChildEnv` strips only claude's session-id env var

5. [low] `--force` auto-relaunch orphans a still-alive supervisor silently