Skip to content

[pull] main from danny-avila:main#116

Merged
pull[bot] merged 1 commit into
innFactory:mainfrom
danny-avila:main
Jun 19, 2026
Merged

[pull] main from danny-avila:main#116
pull[bot] merged 1 commit into
innFactory:mainfrom
danny-avila:main

Conversation

@pull

@pull pull Bot commented Jun 19, 2026

Copy link
Copy Markdown

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…ed containers (#252)

* fix(cloudflare): client-side timeout on sandbox exec() to bound stalled containers

The native Cloudflare Sandbox DO exec() is effectively uncancellable from the host
(ExecOptions has no signal, so supportsExecSignal is false for the native transport)
and its `timeout` option is not enforced when the container/RPC itself stalls, while
the in-sandbox `timeout(1)` wrapper only bounds a command that is actually running.
So a stalled exec (an unresponsive / cold container) hangs until the host's run-level
abort, burning the entire run budget on a single tool call. The direct exec paths
(executeCloudflareBash / executeCloudflareCode) had no client-side timeout at all.

Add withClientTimeout(): a Promise.race backstop around every sandbox.exec() so the
host await settles within clientExecTimeoutMs() (outerTimeoutMs + 5s) regardless of
transport. On timeout the underlying exec may keep running in the DO (a native-DO
exec can't be truly cancelled), so its late settlement is swallowed to avoid an
unhandled rejection. The spawn path already timed out via spawnLocalProcess's kill
timer; this closes the gap for the direct exec paths.

Fixes #251.

* fix(cloudflare): address Codex review on the exec client-timeout

- unref the backstop timer so a timed-out / early-killed spawn doesn't keep the
  process/event loop alive until the timer fires (also fixes the jest open-handle
  warning) [Codex P2]
- fire-and-forget the executeCloudflareCode temp-dir cleanup so a fully-stalled
  sandbox doesn't add a second client timeout to the caller's latency [Codex P2]
- apply the client-side timeout to the bash_programmatic_tool_calling exec in
  CloudflareProgrammaticToolCalling.executeGeneratedCloudflareBash, which called
  sandbox.exec() directly and could still hang; export withClientTimeout +
  clientExecTimeoutMs to share them [Codex P2]

* fix(cloudflare): address Codex cycle-2 review

- unref the client-timeout backstop ONLY on the spawn path; the awaited direct-exec
  paths keep it ref'd so the timeout is guaranteed to fire even if nothing else is
  pending [Codex P2]
- await temp-dir cleanup after a successful code exec; only detach it (unref'd) on a
  stalled/failed exec so normal runs still clean up before returning [Codex P2]
- abort signal-aware execs on client timeout via a new execWithClientTimeout helper
  (creates a controller, passes signal when supportsExecSignal, aborts on timeout);
  applied to the direct bash/code + bash-programmatic paths [Codex P2]
- test: signal-aware exec is aborted when the client timeout fires

* fix(cloudflare): address Codex cycle-3 review

- in withClientTimeout, reject the client-timeout FIRST and only then run onTimeout
  (abort), so a signal-aware exec's resulting AbortError can't win the race and
  surface to the caller instead of the timeout message [Codex P2]
- route temp-dir cleanup through execWithClientTimeout (add an unref pass-through),
  so a stalled cleanup is actually aborted on signal-aware runtimes instead of
  leaking the socket until the bridge/network stack times it out [Codex P2]

* fix(cloudflare): address Codex cycle-4 — preserve caller abort signal

execWithClientTimeout overwrote a caller-provided options.signal with its private
timeout controller, so on signal-aware runtimes a run/user cancellation no longer
reached the exec until the client timeout fired. Now compose them: forward the
caller's signal to the timeout controller so EITHER source cancels the exec. Test
added for the composition.

* fix(cloudflare): address Codex cycle-5 review

- strip a caller-provided options.signal for native runtimes (supportsExecSignal
  false) so the {...options} spread can't reintroduce a signal the native DO RPC
  cannot clone/consume [Codex P2]
- remove the composed caller-abort listener in a finally after the exec settles, so
  reusing a long-lived/shared caller signal across execs doesn't leak listeners
  or trigger MaxListeners warnings [Codex P2]
- test: native runtime strips the caller signal
@pull pull Bot locked and limited conversation to collaborators Jun 19, 2026
@pull pull Bot added the ⤵️ pull label Jun 19, 2026
@pull pull Bot merged commit c98da8e into innFactory:main Jun 19, 2026
1 check passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant