fix(agent): halt agent loop on first backend user-state failure (TAURI-RUST-5KG) by CodeGhost21 · Pull Request #3334 · tinyhumansai/openhuman

CodeGhost21 · 2026-06-04T07:39:04Z

Summary

The previous attempt at TAURI-RUST-5KG (#3318, closed) routed BackendUserStateError away from report_error so identical retries stopped flooding Sentry — silencing the alarm without putting out the fire. The actual bug is upstream: when web_search_tool hits 400 Insufficient balance, the agent retries 19× per turn because the underlying condition (empty wallet) is global and the existing (tool, args)-coupled breaker can't catch varied queries.
Real fix: stop the runaway. Embed a BACKEND_USER_STATE_MARKER in the failed tool result and halt the agent loop on the first occurrence with a "user must act" summary — same idea as the existing POLICY_BLOCKED_MARKER / POLICY_DENIED_MARKER pattern, but more aggressive because the condition can never be resolved by retry.
Route the (single) terminal capture through report_error_or_expected instead of silently skipping. The existing breadcrumb classifier already knows BackendUserError / BudgetExhausted / ProviderUserState → demoted to warn-level. Net: ~19 always-error captures per turn → 1 classified breadcrumb per turn, AND the agent stops retrying.

Problem

Sentry TAURI-RUST-5KG — tool=web_search_tool, operation=execute, iteration=19, ~1860 events / 9 users on Windows releases 0.56.0 → 0.57.13. The closed PR #3318 correctly identified that the integrations breadcrumb classifier already handled Insufficient balance as a user-state error, then flattened the classification when IntegrationClient::{post,get} bailed with anyhow!. It typed the error at the boundary (BackendUserStateError) and made run_one_tool downcast.

But the closed PR's chosen exit was to skip report_error for the user-state case. That makes Sentry quiet — and leaves the agent retrying 19 times because the per-(tool, args) REPEAT_FAILURE_THRESHOLD = 3 doesn't trip when the LLM varies the search query, and the consecutive NO_PROGRESS_FAILURE_THRESHOLD = 6 resets on any interleaved success. The user objected that "ignoring an error is not fixing it" — they are right; the bug is the retry loop, not the captured event count.

Solution

Three coordinated changes, building on the typed BackendUserStateError boundary work from the closed PR (cherry-picked into the first commit of this PR so the boundary classification remains the source of truth):

`src/openhuman/agent/harness/tool_loop.rs`

New pub(crate) const BACKEND_USER_STATE_MARKER: &str = "[backend-user-state]". Module docstring spells out why it halts on first occurrence (unlike HARD_REJECT_REPEAT_THRESHOLD, which lets the model see one error and pivot) — the condition is global, retries with different args or different paid tools can't help, only the user can.
RepeatFailureGuard::record now checks for the marker before the existing (tool, args) repeat-threshold gate and the hard-reject branch. When present and success == false, it returns a halt summary on the very first occurrence with the actionable error preserved.

`src/openhuman/agent/harness/engine/tools.rs`

Ok(Err(e)) branch: when is_backend_user_state_error(&e) is true, prefix the LLM-visible result text with BACKEND_USER_STATE_MARKER (so the breaker sees it) AND call report_error_or_expected instead of report_error. The classifier demotes BackendUserError / BudgetExhausted / ProviderUserState to warn-level — Sentry still sees one breadcrumb per turn (we're not suppressing observability), just classified correctly. System / 5xx / non-classifiable errors continue using report_error with the always-capture path, unchanged.
New outcome=failed_user_state tag distinguishes the demoted path from outcome=failed in dashboards.

`src/openhuman/agent/harness/tool_loop_tests.rs`

Four new RepeatFailureGuard tests, all in the same style as the existing hard_reject_* tests:

Test	Pins
`backend_user_state_marker_halts_on_first_occurrence`	first-occurrence halt + halt-summary shape (labels failure class, tells model to surface, preserves root cause)
`backend_user_state_marker_halts_regardless_of_args`	not coupled to `(tool, args)` repetition (the closed-PR shortcoming)
`backend_user_state_marker_takes_precedence_over_generic_threshold`	ordering: marker check fires before the count-based gates, so `count=1 < REPEAT_FAILURE_THRESHOLD=3` still trips
`backend_user_state_unmarked_failures_use_normal_threshold`	regression guard: ordinary failures still need 3 identical retries to trip (no behaviour widening)

Design notes

Marker pattern was chosen over extending ToolRunResult with a typed failure_kind enum so the contract mirrors the existing POLICY_BLOCKED_MARKER / POLICY_DENIED_MARKER convention exactly. The breaker already string-scans for those; adding a third marker is a 6-line change.
The marker is added in run_one_tool's Ok(Err(e)) branch — the same branch the integrations tools surface through via ?. Other tool paths (e.g. tools that wrap errors as Ok(ToolResult { is_error: true, … })) are untouched here; they don't carry the typed BackendUserStateError because the type info is already lost by the time they format the body, and the existing thresholds catch their identical retries fine. If a future tool needs the typed marker, the fix is at its boundary, not here.
Halting on first occurrence is intentionally more aggressive than HARD_REJECT_REPEAT_THRESHOLD = 2. Hard rejects let the model see one block and pivot (the LLM might try a different file, different command). For backend-user-state, pivot is futile — the wallet is empty for all paid tools. The halt summary explicitly tells the model to surface, not retry.
No agent-side suppression. report_error_or_expected runs the same classifier the integrations client already used at the breadcrumb site; if the body somehow doesn't classify (network bug, malformed wrap), it falls through to report_error_message and Sentry sees the unclassified event. Defense in depth — no silent black-hole.

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — 4 new RepeatFailureGuard tests pin the first-occurrence halt path, the (tool, args) decoupling, the precedence ordering vs. the generic threshold, and the negative-case regression guard. 7 client tests from the cherry-picked typed-error commit cover the four wrapped bail sites + Display preservation + chain-walk discrimination + 5xx negative.
N/A: behaviour-only change — Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md (no new user-visible feature; tightens an existing safety net)
N/A: behaviour-only change — All affected feature IDs from the matrix are listed in the PR description under ## Related
No new external network dependencies introduced (mock backend used per Testing Strategy)
N/A: not a release-cut surface — Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)
N/A: Sentry-tracked issue (TAURI-RUST-5KG), no GitHub issue — Linked issue closed via Closes #NNN in the ## Related section

Impact

Runtime: desktop core only — src/openhuman/agent/harness/tool_loop.rs + src/openhuman/agent/harness/engine/tools.rs + the cherry-picked src/openhuman/integrations/client.rs typed-error boundary. No frontend, no Tauri shell, no schema changes.
User-visible: agents that previously ground through 19 retries on an empty-wallet web_search_tool call now stop on the first failure and surface a "out of credits — please add credits" message to the user. Identical error text still flows through; the difference is the loop terminating immediately instead of generating 18 more identical failures.
Performance: negligible — one extra is_backend_user_state_error(&e) downcast and one extra result.contains(BACKEND_USER_STATE_MARKER) string scan per failed tool call. Both run only on the error path.
Security/migration/compat: none. The marker prefix is purely additive in the LLM-visible result text and matches an existing string convention. Existing callers that scan tool results don't recognise the marker; they treat it as an opaque error prefix exactly as they treat [policy-blocked] today.

Closes: N/A (Sentry TAURI-RUST-5KG, no linked GitHub issue)
Supersedes: #3318 (closed — silenced Sentry rather than fixing the runaway retry; the typed-error commit from that PR is preserved here as the first commit because the boundary classification is still correct)
Follow-up PR(s)/TODOs: a backend-driven pre-flight balance check could prevent the agent from picking web_search_tool at all when the wallet is empty — strictly orthogonal to this PR, which addresses the retry loop directly.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A (Sentry TAURI-RUST-5KG)

Commit & Branch

Branch: fix/sentry-tauri-rust-5kg-halt-on-user-state
Commit SHA: see PR head

Validation Run

N/A: Rust-only change — pnpm --filter openhuman-app format:check
N/A: Rust-only change — pnpm typecheck
Focused tests: cargo test --manifest-path Cargo.toml --lib agent::harness::tool_loop (26/26 pass, including 4 new), cargo test --manifest-path Cargo.toml --lib integrations::client (26/26 pass, no regression from cherry-picked typed-error commit), cargo test --manifest-path Cargo.toml --lib agent::harness (423/423 pass), cargo test --manifest-path Cargo.toml --lib observability:: (142/142 pass, classifier contract preserved)
Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml -- --check clean; cargo check --manifest-path Cargo.toml --tests clean
N/A: shell not touched — Tauri fmt/check (if changed)

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: agents that hit a typed BackendUserStateError from the integrations client (insufficient balance, missing required field, toolkit not enabled) now halt the loop on the first occurrence with a "user must act" summary, instead of grinding through up to 19 retries before the iteration cap.
User-visible effect: terminal "out of credits / disabled toolkit / sign in" failures surface immediately to the user instead of being preceded by 18 indistinguishable retry events. Error messages themselves are unchanged.

Parity Contract

Legacy behavior preserved: non-backend-user-state failures continue to use the existing 3-attempt REPEAT_FAILURE_THRESHOLD, 2-attempt HARD_REJECT_REPEAT_THRESHOLD, and 6-consecutive NO_PROGRESS_FAILURE_THRESHOLD gates verbatim. The marker check is added before those gates; when absent, control falls through and behaviour is unchanged.
Guard/fallback/dispatch parity checks: backend_user_state_unmarked_failures_use_normal_threshold pins that ordinary failures still need 3 identical retries (no widening); hard_reject_blocked_halts_on_first_repeat_not_third and hard_reject_denied_halts_on_first_repeat continue to pass unchanged, confirming the marker check doesn't perturb the existing hard-reject paths.

Duplicate / Superseded PR Handling

Duplicate PR(s): fix(integrations): type backend user-state errors so tool runner skips Sentry (TAURI-RUST-5KG) #3318 (closed)
Canonical PR: this PR
Resolution: supersedes — the closed PR's typed-error boundary work is preserved as the first commit; the additional commit replaces the silence-Sentry exit with halt-on-first + classifier-aware capture.

Summary by CodeRabbit

Bug Fixes
- Improved handling of deterministic backend "user must act" failures (e.g., insufficient balance, toolkit disabled): operations now halt immediately on first detection and present a clear, actionable error instead of repeated retries.
- Preserves the original actionable message while preventing repeated retry loops.
Tests
- Added unit and integration tests to validate classification, halt-on-first-occurrence behavior, and correctness of user-facing error text.

…s Sentry (TAURI-RUST-5KG) `web_search_tool` flooded Sentry with 1860 events / 9 users for backend 400 "Insufficient balance" — the agent retried 19 times in a single turn, and each `Ok(Err(e))` in `agent::harness::engine::tools::run_one_tool` called the always-capture `report_error`. The integrations breadcrumb classifier already knew "Insufficient balance" was a user-state error (see tinyhumansai#2809 / TAURI-RUST-4ZF), but that distinction was lost at the `?` boundary because `IntegrationClient::{post,get}` flattened every non-2xx into an opaque `anyhow::Error`. Fix it at the architectural boundary that knows the difference: - Add `BackendUserStateError` (Display + std::error::Error) and the `is_backend_user_state_error(&anyhow::Error)` helper to the integrations client. Wrap the four bail sites (post/get × non-2xx / envelope-failure) with the typed marker when `expected_error_kind` classifies the message as `BackendUserError`, `BudgetExhausted`, or `ProviderUserState`. Display string is preserved verbatim, so the ~30 existing callers that stringify the error see no behaviour change. - In `agent::harness::engine::tools::run_one_tool`, downcast the bubbled error: when the typed marker is present, log a warn breadcrumb (no Sentry capture) and surface the message to the LLM via the existing `(text, false)` failure tuple. The circuit breaker still records the failure so 19 retries with the same user-state message trip out. Real failures (5xx, transport bugs, unknown envelope shapes) keep their `report_error` Sentry path unchanged. Scope: every tool that calls the integrations backend benefits — search/parallel/tinyfish, all composio, twilio, apify, google_places, stock_prices, web3, linkedin_enrichment — not just `web_search_tool`. Tests: 7 new cases in `client_tests.rs` pin both halves of the contract: typed wrapping fires for 400/insufficient-balance, 403/toolkit-not-enabled, and 2xx-envelope user-state failures; 500s remain un-typed; Display verbatim; `is_backend_user_state_error` walks the chain so `.context()` wraps don't silently re-enable capture.

…I-RUST-5KG) The previous attempt (tinyhumansai#3318, closed) routed `BackendUserStateError` away from `report_error` so identical retries didn't flood Sentry. That silenced the alarm but left the actual bug — the agent kept retrying a terminally-failing tool 19 times per turn — running. Real fix: 1. Embed a stable `BACKEND_USER_STATE_MARKER` in the tool result text whenever `run_one_tool` sees a `BackendUserStateError` propagated from the integrations client. The marker rides through `ToolRunResult.text` exactly like the existing `POLICY_BLOCKED_MARKER` / `POLICY_DENIED_MARKER` pattern. 2. Extend `RepeatFailureGuard::record` to halt on the **first** occurrence of the marker — not the second identical attempt (`HARD_REJECT_REPEAT_THRESHOLD = 2`), not the third (`REPEAT_FAILURE_THRESHOLD = 3`), not the sixth consecutive (`NO_PROGRESS_FAILURE_THRESHOLD = 6`). The underlying condition (empty wallet, disabled toolkit, expired session) is *global* — varying the query or pivoting to a different paid tool cannot resolve it. Halt immediately with a clear "user must act" summary so the agent reports back instead of grinding. 3. Replace the closed PR's silent skip with `report_error_or_expected`. The existing breadcrumb classifier demotes the `BackendUserError` / `BudgetExhausted` / `ProviderUserState` buckets to a single warn-level breadcrumb — observability still sees one classified event per turn instead of 19 always-error captures, but errors are not *suppressed*. Net effect on TAURI-RUST-5KG (~1860 Sentry hits / 9 users): ~19 always-captured errors per turn → 1 classifier-demoted breadcrumb per turn, AND the agent stops retrying and surfaces the actionable message to the user. Includes 4 new `RepeatFailureGuard` tests pinning: halt on first occurrence; halt regardless of `(tool, args)`; precedence over the generic 3-attempt threshold; unmarked failures still use the 3-attempt threshold (no behaviour widening).

coderabbitai · 2026-06-04T07:39:23Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 558ec2c5-afba-49e3-8478-7cfce5ef40aa

📥 Commits

Reviewing files that changed from the base of the PR and between 63332ee and 7c6e883.

📒 Files selected for processing (1)

src/openhuman/agent/harness/tool_loop.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/openhuman/agent/harness/tool_loop.rs

📝 Walkthrough

Walkthrough

This PR classifies backend "user-state" failures at the integration layer, returns a typed marker error, embeds a stable marker into tool error text in run_one_tool, and makes the RepeatFailureGuard halt immediately when that marker is seen.

Changes

Backend user-state marker and halt mechanism

Layer / File(s)	Summary
User-state error classification and type definition `src/openhuman/integrations/client.rs`, `src/openhuman/integrations/mod.rs`	Adds `BackendUserStateError`, `classify_as_user_state`, and `is_backend_user_state_error`; re-exports the new symbols.
Integration request failure handling and tests `src/openhuman/integrations/client.rs`, `src/openhuman/integrations/client_tests.rs`	`IntegrationClient::post`/`get` build a `bail_message`, classify failures, and return typed `BackendUserStateError` for matching HTTP/envelope error cases; tests cover 400/403/500, envelope `success:false`, display format, and wrapped error detection.
Tool execution: embed marker on user-state errors `src/openhuman/agent/harness/engine/tools.rs`	`run_one_tool` detects typed backend user-state errors, reports them via `report_error_or_expected`, and embeds `BACKEND_USER_STATE_MARKER` into the returned tool error text; non-user-state errors use existing reporting and generic messages.
Tool loop circuit breaker with marker detection and halt `src/openhuman/agent/harness/tool_loop.rs`, `src/openhuman/agent/harness/tool_loop_tests.rs`	Adds `BACKEND_USER_STATE_MARKER` constant and updates `RepeatFailureGuard::record` to remove the marker and return an immediate `Some("Stopping: ...")` halt summary when present. Tests verify first-occurrence halt, argument-independence, precedence over generic repeat thresholds, and unchanged behavior for unmarked failures.

Sequence Diagrams

sequenceDiagram
    participant Backend as Backend Response
    participant IntClient as IntegrationClient
    participant Classify as classify_as_user_state
    participant RunTool as run_one_tool
    participant ToolLoop as RepeatFailureGuard

    Backend->>IntClient: !status.is_success() or success:false
    IntClient->>Classify: classify_as_user_state(message)
    alt is user-state failure
        Classify-->>IntClient: BackendUserStateError
        IntClient-->>RunTool: Err(user_state)
        RunTool->>RunTool: is_backend_user_state_error?
        RunTool->>RunTool: embed BACKEND_USER_STATE_MARKER
        RunTool-->>ToolLoop: tool error text + marker
    else non-user-state
        Classify-->>IntClient: None
        IntClient-->>RunTool: Err(generic)
        RunTool-->>ToolLoop: tool error text (no marker)
    end

    ToolLoop->>ToolLoop: detect marker in text?
    alt marker present
        ToolLoop->>ToolLoop: halt immediately
        ToolLoop-->>ToolLoop: return Some(Stopping: ...)
    else no marker
        ToolLoop->>ToolLoop: apply retry threshold
        ToolLoop-->>ToolLoop: return None or continue
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

tinyhumansai/openhuman#1795: Related to expected_error_kind and envelope success:false routing that affect user-state classification.
tinyhumansai/openhuman#3012: Overlaps changes to the TurnEngine and the centralized tool executor touched by run_one_tool.

Suggested reviewers

oxoxDev
graycyrus

Poem

🐇 A tiny marker I did find,
tucked in an error, neatly signed.
"User must act" — the loop now knows,
it stops at once where caution grows.
Hops of code, a tidy bind.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main fix: halting the agent loop on first backend user-state failure, which is the core objective of all changes across five modified files.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/harness/tool_loop.rs`:
- Around line 143-151: The halt message currently uses truncate_for_halt(result)
which may contain the internal token "[backend-user-state]"; strip or remove
that marker from the truncated text before formatting the user-visible halt
reason. In the block that constructs the message (the return Some(format!(...))
in tool_loop.rs), take the output of truncate_for_halt(result), remove the exact
substring "[backend-user-state]" (or use a safe replace/strip function) and then
interpolate the cleaned string into the format call so no internal routing token
is leaked to users.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb8570c7-81eb-4cb2-8dad-ab1141edfdd1

📥 Commits

Reviewing files that changed from the base of the PR and between 87a91ae and 97bb991.

📒 Files selected for processing (6)

src/openhuman/agent/harness/engine/tools.rs
src/openhuman/agent/harness/tool_loop.rs
src/openhuman/agent/harness/tool_loop_tests.rs
src/openhuman/integrations/client.rs
src/openhuman/integrations/client_tests.rs
src/openhuman/integrations/mod.rs

YellowSnnowmann

Review

Walkthrough

This PR correctly addresses TAURI-RUST-5KG at the root: the runaway retry loop (not just the Sentry noise). Typing the error at the HTTP boundary, embedding BACKEND_USER_STATE_MARKER, and halting RepeatFailureGuard on first occurrence is the right architecture — it mirrors the existing POLICY_BLOCKED_MARKER/POLICY_DENIED_MARKER pattern cleanly. Design is solid, CI is fully green (all 20 checks pass), and the test coverage is thorough.

One open issue needs to be addressed before merge — it's the same item CodeRabbit flagged.

🔴 Blocker — `src/openhuman/agent/harness/tool_loop.rs:151`

Internal routing token leaks into the user/LLM-visible halt message.

At the point the halt branch fires, result is:

[backend-user-state] Error executing web_search_tool: Backend returned 400 …: Insufficient balance

Passing it directly to truncate_for_halt(result) embeds the raw [backend-user-state] token in the message the model (and user) sees. The marker is an internal routing token — it has no meaning outside of RepeatFailureGuard and will read as noise or cause confusion.

Fix:

// before
if result.contains(BACKEND_USER_STATE_MARKER) {
    return Some(format!(
        "Stopping: the `{tool}` call returned a backend user-state error \
         — this is a deterministic condition that requires user action …\
         Reason:\n{}\n\n…",
        truncate_for_halt(result),
    ));
}

// after
if result.contains(BACKEND_USER_STATE_MARKER) {
    let clean_reason = result
        .replace(BACKEND_USER_STATE_MARKER, "")
        .trim()
        .to_string();
    return Some(format!(
        "Stopping: the `{tool}` call returned a backend user-state error \
         — this is a deterministic condition that requires user action …\
         Reason:\n{}\n\n…",
        truncate_for_halt(&clean_reason),
    ));
}

🟡 Minor — `tool_loop_tests.rs` — missing negative assertion in `backend_user_state_marker_halts_on_first_occurrence`

The three existing assert!(msg.contains(…)) checks pass even when the marker is present in the output. Once the fix above is applied, add a negative assertion to lock in the clean-output behavior:

assert!(
    !msg.contains(BACKEND_USER_STATE_MARKER),
    "internal routing token must not leak into user-visible halt message; got: {msg}"
);

Verified / looks good

classify_as_user_state correctly narrows to BackendUserError / BudgetExhausted / ProviderUserState — transport and session error kinds are excluded, keeping existing retry/re-auth machinery intact ✅
is_backend_user_state_error chain-walks anyhow so future .context() wraps don't silently re-enable Sentry capture ✅
All 4 bail sites in client.rs (POST non-2xx, POST envelope, GET non-2xx, GET envelope) covered ✅
Display impl on BackendUserStateError is identical to the original anyhow::bail! string — additive, no stringify callers broken ✅
backend_user_state_unmarked_failures_use_normal_threshold is the critical regression guard — ordinary 3-attempt threshold untouched ✅
outcome=failed_user_state Sentry tag is a clean dashboard win ✅

Just the one fix in tool_loop.rs + the test pin and this is ready.

@YellowSnnowmann

…ssage The internal routing token was passed through truncate_for_halt(result) verbatim into the LLM/user-visible halt summary. Strip it with .replace() + .trim() before formatting so the message reads cleanly without the [backend-user-state] prefix. Also pins the clean-output invariant with a negative assertion in backend_user_state_marker_halts_on_first_occurrence — addresses @coderabbitai and @YellowSnnowmann review (TAURI-RUST-5KG).

coderabbitai

🧹 Nitpick comments (1)

src/openhuman/agent/harness/tool_loop.rs (1)
142-156: ⚡ Quick win

Consider adding debug-level logging when the marker triggers a halt.

The circuit breaker has multiple halt conditions (backend user-state, hard rejects, repeat threshold, no-progress) but only the caller logs a generic "circuit breaker tripped" message. Adding a debug log here when the marker is detected would help distinguish marker-based halts from other rules during troubleshooting, especially given the global-halt semantic is new.
🔍 Example logging addition
         if result.contains(BACKEND_USER_STATE_MARKER) {
+            tracing::debug!(
+                tool,
+                "[RepeatFailureGuard] backend user-state marker detected — halting on first occurrence"
+            );
             let clean_reason = result
                 .replace(BACKEND_USER_STATE_MARKER, "")
                 .trim()
As per coding guidelines: "Add substantial debug-level logs while implementing features or fixes in Rust using log / tracing crate with stable prefixes and correlation fields (request IDs, method names, entity IDs) for grep-friendly tracing"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/tool_loop.rs` around lines 142 - 156, The
BACKEND_USER_STATE_MARKER branch in the tool-halt logic (inside the if
result.contains(BACKEND_USER_STATE_MARKER) block) should emit a debug-level log
before returning so marker-based halts are distinguishable; add a
tracing::debug! (or log::debug!) call that logs a stable prefix like
"circuit_breaker:backend_user_state" and includes correlation fields (request
ID, method / handler name, tool name via `tool`, and a truncated reason using
`truncate_for_halt(&clean_reason)`) so operators can grep and correlate these
events; place this single debug call immediately after computing `clean_reason`
and before the existing return.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/agent/harness/tool_loop.rs`:
- Around line 142-156: The BACKEND_USER_STATE_MARKER branch in the tool-halt
logic (inside the if result.contains(BACKEND_USER_STATE_MARKER) block) should
emit a debug-level log before returning so marker-based halts are
distinguishable; add a tracing::debug! (or log::debug!) call that logs a stable
prefix like "circuit_breaker:backend_user_state" and includes correlation fields
(request ID, method / handler name, tool name via `tool`, and a truncated reason
using `truncate_for_halt(&clean_reason)`) so operators can grep and correlate
these events; place this single debug call immediately after computing
`clean_reason` and before the existing return.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 081fc44c-3355-4b45-a464-7c6b54c1f259

📥 Commits

Reviewing files that changed from the base of the PR and between 97bb991 and 63332ee.

📒 Files selected for processing (2)

src/openhuman/agent/harness/tool_loop.rs
src/openhuman/agent/harness/tool_loop_tests.rs

… fires Adds a tracing::debug! call at [circuit_breaker:backend_user_state] so marker-triggered halts are distinguishable from generic repeat-threshold or no-progress halts in operator traces — addresses CodeRabbit nitpick on TAURI-RUST-5KG.

senamakel · 2026-06-04T14:20:28Z

as we spoke. serious errors like out of balance need to fail loudly.

CodeGhost21 added 2 commits June 4, 2026 12:52

CodeGhost21 requested a review from a team June 4, 2026 07:39

coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. sentry-traced-bug Bug identified via Sentry triage bug labels Jun 4, 2026

coderabbitai Bot requested changes Jun 4, 2026

View reviewed changes

Comment thread src/openhuman/agent/harness/tool_loop.rs

CodeGhost21 mentioned this pull request Jun 4, 2026

fix(cron): halt agent-job retry on backend session-expired (TAURI-RUST-N) #3350

Merged

12 tasks

YellowSnnowmann reviewed Jun 4, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

coderabbitai Bot previously approved these changes Jun 4, 2026

View reviewed changes

CodeGhost21 dismissed coderabbitai[bot]’s stale review via 7c6e883 June 4, 2026 12:11

coderabbitai Bot approved these changes Jun 4, 2026

View reviewed changes

senamakel closed this Jun 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): halt agent loop on first backend user-state failure (TAURI-RUST-5KG)#3334

fix(agent): halt agent loop on first backend user-state failure (TAURI-RUST-5KG)#3334
CodeGhost21 wants to merge 4 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/sentry-tauri-rust-5kg-halt-on-user-state

CodeGhost21 commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

YellowSnnowmann left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

senamakel commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

CodeGhost21 commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

src/openhuman/agent/harness/tool_loop.rs

src/openhuman/agent/harness/engine/tools.rs

src/openhuman/agent/harness/tool_loop_tests.rs

Design notes

Submission Checklist

Impact

Related

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

Validation Run

Validation Blocked

Behavior Changes

Parity Contract

Duplicate / Superseded PR Handling

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

YellowSnnowmann left a comment

Choose a reason for hiding this comment

Review

Walkthrough

🔴 Blocker — src/openhuman/agent/harness/tool_loop.rs:151

🟡 Minor — tool_loop_tests.rs — missing negative assertion in backend_user_state_marker_halts_on_first_occurrence

Verified / looks good

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

senamakel commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CodeGhost21 commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

`src/openhuman/agent/harness/tool_loop.rs`

`src/openhuman/agent/harness/engine/tools.rs`

`src/openhuman/agent/harness/tool_loop_tests.rs`

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

🔴 Blocker — `src/openhuman/agent/harness/tool_loop.rs:151`

🟡 Minor — `tool_loop_tests.rs` — missing negative assertion in `backend_user_state_marker_halts_on_first_occurrence`