fix(channels): distinguish rate-limit sources in chat error classifier (#2364) by CodeGhost21 · Pull Request #2371 · tinyhumansai/openhuman

CodeGhost21 · 2026-05-20T20:04:10Z

Summary

classify_inference_error in web.rs used to tag any error string containing "rate limit" as rate_limited, including the SecurityPolicy hourly-cap strings emitted by built-in tools — so users saw "Your AI provider is rate-limiting you" copy for an OpenHuman-local safety cap.
Split the catch-all branch into three sources (action_budget_exceeded, max_iterations, rate_limited) and surface Retry-After seconds in the user-facing message when the upstream supplies them.
No frontend change needed — the new copy flows through the existing chat_error.message forwarder in ChatRuntimeProvider.

Problem

Issue #2364 reports that one chat thread shows "You're being rate-limited" while a fresh thread works. Tracing the frontend (ChatRuntimeProvider.onError) showed no per-thread sticky state — inferenceStatus, inferenceTurnLifecycle, and activeThreadId all clear on chat_error. The stickiness is perceived, driven by:

Generic copy that doesn't say "you can retry in this thread".
No retry-after window so the user has no idea when to try again.
The same generic copy used regardless of whether the limit came from the upstream LLM provider, the SecurityPolicy budget on tool calls, or the agent-loop max-iterations cap.

Solution

Three new / reshaped branches in classify_inference_error:

| Source | Classification | User-facing copy |
| --- | --- | --- |
| SecurityPolicy hourly action budget (web_fetch / curl / http_request / etc.) | `action_budget_exceeded` | "You've hit OpenHuman's per-hour action budget — this is a local safety cap, not your AI provider. The window decays gradually; you can keep chatting in this thread …" |
| Agent-loop max-iterations cap ("Agent exceeded maximum tool iterations") | `max_iterations` | "The agent ran the maximum number of tool steps for one turn … You can retry the same question in this thread once the underlying limit clears." |
| Upstream provider 429 / "rate limit" | `rate_limited` (enriched) | "Your AI provider is rate-limiting requests. This is a transient upstream limit, not a thread-level block — you can retry in this thread. Try again in 30 seconds." Retry-after is parsed from `Retry-After:` / `retry_after:` header bodies; fractional values are rounded up. |

The SecurityPolicy and max-iterations checks run BEFORE the provider-429 branch, so errors that merely contain the substring `rate limit` no longer leak into the `rate_limited` bucket. Existing `rate_limited` callers (FE handler, tests) continue to receive the same `error_type` token; only the human-readable message changed.
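The ordering constraint can be illustrated with a minimal Rust sketch. This is not the PR's code: `ErrorKind` and `classify` are hypothetical stand-ins for `classify_inference_error` (which returns an `error_type` token plus message copy), and the max-iterations check is an inlined substring test rather than the existing `is_max_iterations_error` predicate. The matched strings are the ones quoted in this PR.

```rust
/// Hypothetical, simplified stand-in for the classifier's output.
#[derive(Debug, PartialEq, Eq)]
enum ErrorKind {
    ActionBudgetExceeded,
    MaxIterations,
    RateLimited,
    Inference,
}

/// Illustrative only: the most specific checks run first, so an
/// OpenHuman-local cap is never swallowed by the generic
/// "rate limit" / "429" branch further down.
fn classify(err: &str) -> ErrorKind {
    let lower = err.to_lowercase();

    // 1. SecurityPolicy hourly action budget (local safety cap).
    if lower.contains("rate limit exceeded: action budget exhausted")
        || lower.contains("rate limit exceeded: too many actions in the last hour")
        || lower.contains("action blocked: rate limit exceeded")
    {
        return ErrorKind::ActionBudgetExceeded;
    }

    // 2. Agent-loop max-iterations cap.
    if lower.contains("agent exceeded maximum tool iterations") {
        return ErrorKind::MaxIterations;
    }

    // 3. Upstream provider throttling: only reached when 1 and 2 miss.
    if lower.contains("rate limit") || lower.contains("429") {
        return ErrorKind::RateLimited;
    }

    // Assumption: everything else lands in a generic bucket here; the real
    // classifier has further branches (timeout, auth, budget, ...).
    ErrorKind::Inference
}
```

If branch 3 were checked first, every SecurityPolicy string above would match it, because each one contains `rate limit`.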

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason.

Tests added or updated (happy path + at least one failure / edge case) — 5 new tests + 2 pre-existing classifier tests; cargo test --lib classify_inference_error → 7 passed.
Diff coverage ≥ 80% — every new branch and the new retry-after helper has at least one focused test. No untested branches added.
Coverage matrix updated — N/A: behaviour-only change to existing classifier output; no new feature row added/removed/renamed.
All affected feature IDs from the matrix are listed in ## Related — N/A: no matrix row touched.
No new external network dependencies introduced — pure string classification, no IO.
Manual smoke checklist updated if this touches release-cut surfaces — N/A: error-message wording only; happy-path chat behaviour is unchanged.
Linked issue closed via Closes #NNN in the ## Related section — see below.

Impact

Runtime/platform: backend chat error path (channels::providers::web::classify_inference_error).
User-visible: when the agent hits a SecurityPolicy budget cap, the chat surface now says "OpenHuman's per-hour action budget" instead of mislabeling the cap as an upstream provider 429. Provider 429s now include a parsed retry-after window when the upstream supplies one.
Performance / security: zero runtime cost (string classification only). No new error data is exposed; the retry-after parser only reads what's already in the error body.
Migration / compatibility: error_type tokens are unchanged for upstream consumers; only the message body is reworded. FE error forwarder picks up the new copy automatically.

Tests

cargo test --lib classify_inference_error → 7 passed, 0 failed (5 new, 2 pre-existing).

| New test | Covers |
| --- | --- |
| `classify_inference_error_distinguishes_action_budget_from_provider_429` | all three SecurityPolicy strings classify as `action_budget_exceeded`; copy says "local safety cap" + "can keep chatting in this thread" |
| `classify_inference_error_max_iterations_gets_dedicated_branch` | flattened web-channel error wrapper resolves to `max_iterations` with the same-thread recovery hint |
| `classify_inference_error_rate_limited_surfaces_retry_after_seconds` | 30-second retry-after appears verbatim; thread-recovery hint present |
| `classify_inference_error_rate_limited_no_retry_after_omits_hint` | 429 without Retry-After does NOT hallucinate a window |
| `classify_inference_error_rate_limited_handles_fractional_and_minute_windows` | 2.4s rounds to 3s; 180s renders as "about 3 minutes" |

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A (GitHub-only issue: "Rate limit state sticks to one chat thread", #2364)

Commit & Branch

Branch: fix/2364-rate-limit-classification (branched from origin/main after fresh fetch)
Commit SHA: see PR head

Validation Run

pnpm --filter openhuman-app format:check — N/A: no frontend changes.
pnpm typecheck — N/A: no TypeScript changes.
Focused tests: cargo test --lib classify_inference_error → 7 passed, 0 failed.
Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml applied; cargo check clean.
Tauri fmt/check (if changed): N/A: no Tauri shell changes.

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: error classification distinguishes OpenHuman SecurityPolicy budget caps from upstream provider 429s; retry-after seconds appear in the user-facing message when parseable.
User-visible effect: chat error bubbles for rate-limit-style errors now name the actual cause and (where known) a concrete retry window.

Parity Contract

Legacy behavior preserved: error_type tokens are unchanged for the existing rate_limited consumers; only message body changed. Generic-inference fallback unchanged. Order of remaining branches (timeout / auth / budget / provider / model_unavailable / config-rejection) is unchanged.
Guard/fallback/dispatch parity checks: the new SecurityPolicy / max-iterations branches are ordered BEFORE the generic 429 branch, so errors that merely contain the substring `rate limit` no longer leak into the `rate_limited` bucket. This is locked in by the `_distinguishes_action_budget_from_provider_429` and `_max_iterations_gets_dedicated_branch` tests.

Duplicate / Superseded PR Handling

Duplicate PR(s): none.
Canonical PR: this PR.
Resolution: N/A.

Summary by CodeRabbit

Bug Fixes
- Improved error messages when action budget is exhausted, clarifying that the limit is local and the thread remains usable
- Enhanced rate-limiting responses with parsed Retry-After timing information to provide clearer retry guidance
- Better error handling and messaging for agent tool iteration limit exceeded scenarios
Tests
- Added comprehensive tests verifying correct error classification and user messaging across multiple failure scenarios

tinyhumansai#2364)

User-perception bug from tinyhumansai#2364: any agent-loop error string containing "rate limit" was classified as `rate_limited` and the user saw "You're being rate-limited. Please wait a moment and try again." That copy implies the AI provider is throttling them and gives no hint about which thread is affected or how long to wait. When the real cause was the SecurityPolicy hourly action cap on built-in tools (web_fetch / curl / http_request), the message was misleading; users opened new threads, saw those "work" (because the new thread didn't trigger a tool-call storm), and concluded the original thread was "stuck".

Fix: split the catch-all rate-limit branch in `classify_inference_error` into three sources and surface retry-after when available.

Classifier (src/openhuman/channels/providers/web.rs)

- `action_budget_exceeded` (new): catches the SecurityPolicy strings emitted by the built-in tools (`Rate limit exceeded: action budget exhausted`, `Rate limit exceeded: too many actions in the last hour`, `Action blocked: rate limit exceeded`). User-facing copy: *"You've hit OpenHuman's per-hour action budget — this is a local safety cap, not your AI provider. The window decays gradually; you can keep chatting in this thread …"*. Ordered BEFORE the provider-429 branch so its substring `rate limit` no longer leaks into the wrong bucket.
- `max_iterations` (new): catches the canonical agent-loop cap string ("Agent exceeded maximum tool iterations") via the existing `is_max_iterations_error` predicate so the user sees: *"The agent ran the maximum number of tool steps for one turn … You can retry the same question in this thread once the underlying limit clears."* Previously this fell through to the opaque generic `inference` bucket.
- `rate_limited` (kept, enriched): now extracts `Retry-After` / `retry_after` seconds from the error body and appends a concrete hint to the user message ("Try again in 30 seconds" / "Try again in about 3 minutes" for windows ≥90s / "You can retry immediately" for 0). Fractional values round up so we never under-promise. Copy also now states the limit is upstream and that retrying in the same thread is fine.

Tests (src/openhuman/channels/providers/web_tests.rs)

- `classify_inference_error_distinguishes_action_budget_from_provider_429`: all three SecurityPolicy strings classify as `action_budget_exceeded` and the copy says "local safety cap" + "can keep chatting in this thread".
- `classify_inference_error_max_iterations_gets_dedicated_branch`: the flattened web-channel error wrapper resolves to `max_iterations` with the same-thread recovery hint.
- `classify_inference_error_rate_limited_surfaces_retry_after_seconds`: 30-second retry-after appears verbatim and the thread-recovery hint is present.
- `classify_inference_error_rate_limited_no_retry_after_omits_hint`: 429 without Retry-After does NOT hallucinate a window.
- `classify_inference_error_rate_limited_handles_fractional_and_minute_windows`: 2.4s rounds to 3s; 180s renders as "about 3 minutes".

`cargo test --lib classify_inference_error` → 7 passed, 0 failed. `cargo check` clean. `cargo fmt` applied.

What's intentionally out of scope

- A real per-thread rate-limit Redux state: the trace shows none exists; `inferenceStatus` / `lifecycle` / `activeThread` all clear on `chat_error` in `ChatRuntimeProvider.onError`, and the FE test suite already exercises this.
- A countdown timer in the composer: separate UI work; this PR keeps the fix server-side so the existing FE forwarder picks it up automatically.

coderabbitai · 2026-05-20T20:04:26Z

Walkthrough

The web channel error classifier now detects OpenHuman action-budget exhaustion, agent max-iteration tool-step exhaustion, and provider rate limits. It adds helpers to parse/format Retry-After timing, reorders classification priority to surface budgets first, and updates rate-limited messages to include optional retry timing hints.

Changes

Error classification improvements with retry messaging

| Layer / File(s) | Summary |
| --- | --- |
| Rate-limit parsing and message formatting `src/openhuman/channels/providers/web.rs` | `parse_retry_after_secs_from_str` extracts numeric Retry-After values (ceiling seconds) from free-form error strings; `retry_after_hint` renders `Option<u64>` into user-facing suffixes (immediate, N seconds, or "about N minutes"). |
| Action-budget detection `src/openhuman/channels/providers/web.rs` | `is_action_budget_exhausted` matches OpenHuman-local action-budget phrases in lowercased error text to classify local safety budget exhaustion. |
| Error classification logic and orchestration `src/openhuman/channels/providers/web.rs` | `classify_inference_error` is reordered to check action-budget first, then `is_max_iterations_error`, then provider `rate limit`/`429`; the `rate_limited` branch now uses parsed Retry-After hints and updated transient-provider messaging. |
| Test suite for error classification paths `src/openhuman/channels/providers/web_tests.rs` | Adds tests validating action-budget classification, max-iterations classification, and several rate-limit scenarios (with/without Retry-After, fractional-second rounding-up, minute-threshold formatting, pluralization, and parsing Retry-After inside JSON bodies). |
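A rough sketch of how Retry-After extraction from a free-form error string could work is shown here. It is an assumption-laden illustration, not the body of `parse_retry_after_secs_from_str`: the function name `parse_retry_after_secs`, the exact prefix list, and the quote-stripping step are chosen for the example, guided only by the behaviour described in this PR (ceiling seconds, `Retry-After:` / `retry_after:` prefixes, values inside JSON bodies).

```rust
/// Hypothetical helper: pull a Retry-After value (in whole seconds,
/// rounded up) out of an arbitrary error string.
fn parse_retry_after_secs(err: &str) -> Option<u64> {
    // Lowercase for case-insensitive matching, and drop double quotes so a
    // JSON body such as {"retry_after": 30} matches the same prefixes.
    let lower: String = err.to_lowercase().replace('"', "");

    for prefix in ["retry-after:", "retry_after:"] {
        if let Some(idx) = lower.find(prefix) {
            // Text immediately after the prefix, minus leading whitespace.
            let rest = lower[idx + prefix.len()..].trim_start();

            // Take the leading run of digits and dots, e.g. "2.4".
            let num: String = rest
                .chars()
                .take_while(|c| c.is_ascii_digit() || *c == '.')
                .collect();

            if let Ok(value) = num.parse::<f64>() {
                if value.is_finite() && value >= 0.0 {
                    // Round up so the hint never under-promises the wait.
                    return Some(value.ceil() as u64);
                }
            }
        }
    }

    // No parseable window: the caller should omit the hint entirely.
    None
}
```

Returning `None` rather than a default keeps the "no Retry-After, no invented window" behaviour that the tests above describe.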

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

tinyhumansai/openhuman#2239: Modifies classify_inference_error branches; related to classifier changes and provider error mappings.

Suggested labels

working

Suggested reviewers

graycyrus

Poem

A rabbit hops through logs at night, 🐰
Counting seconds till retry feels right,
Budgets and iterations neatly pried,
Retry-After hints now stand beside—
Chat resumes when timers say "alright."

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: distinguishing rate-limit sources in the chat error classifier, which is the core objective of the PR. |
| Linked Issues check | ✅ Passed | The PR directly addresses all key requirements from issue `#2364`: distinguishes rate-limit sources (provider 429s, action-budget, max-iterations), provides retry-after parsing and recovery guidance, prevents thread-level state stickiness through accurate classification, and adds comprehensive tests covering transient limits and same-thread retry behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the inference error classifier in src/openhuman/channels/providers/web.rs and its corresponding tests; no unrelated alterations or dependencies outside the stated objectives are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/channels/providers/web.rs`:
- Around line 271-278: The retry hint currently floors minutes in the Some(n)
branch of the match on secs, producing "about 1 minutes" for 90–119s and
understating the wait; change the minutes calculation to round up (e.g., minutes
= (n + 59) / 60) in the Some(n) => branch and format the message using correct
singular/plural ("minute"/"minutes") when building the string for "Try again in
about {mins} minute(s).".
- Around line 243-263: The current prefix list in the retry-after parsing loop
misses quoted JSON keys like "\"retry_after\": 30"; update the prefix matching
in the loop (where variables lower and err are used and the num_str parsing
occurs) to also check for quoted key patterns (e.g. "\"retry_after\":",
"\"retry-after\":", "\"retry_after\" :", "\"retry-after\" :") or use a small
regex that matches optional surrounding quotes, optional whitespace, the key
(retry_after|retry-after) and the colon, then extract the same trailing numeric
substring and parse it as before so quoted JSON bodies return the retry hint.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3857d90f-f32f-4d2a-9bfa-1dd877069a70

📥 Commits

Reviewing files that changed from the base of the PR and between fa8d75f and 9f7a74e.

📒 Files selected for processing (2)

src/openhuman/channels/providers/web.rs
src/openhuman/channels/providers/web_tests.rs

…oted JSON retry_after + round minutes up

Two CodeRabbit findings on src/openhuman/channels/providers/web.rs:

1. **Minor: quoted JSON retry_after.** A serialised provider body like `{"retry_after": 30}` would miss every prefix because the surrounding quote stopped `lower.find("retry_after:")` from matching, and the user lost the retry hint the upstream actually supplied. Normalise by stripping double quotes from the lowercased scan buffer before searching for prefixes. New test `classify_inference_error_rate_limited_parses_quoted_json_retry_after`.
2. **Minor: minute hint rounds up + uses singular/plural.** The `Some(n)` arm used `n / 60` (integer floor) and a hard-coded "minutes" suffix, so 90–119s rendered as "about 1 minutes", which is both grammatically wrong and an instruction to retry sooner than the upstream allows. Round up via `(n / 60) + u64::from(n % 60 != 0)` and pick singular vs plural. New test `classify_inference_error_rate_limited_minute_window_uses_singular_and_rounds_up` pins 90s → "about 2 minutes" and 119s → "about 2 minutes".

`cargo test --lib classify_inference_error` → 9 passed, 0 failed (7 pre-existing + 2 new).
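The rounding fix can be sketched as follows. This is a hedged illustration of the hint formatting described in this PR, not the actual `retry_after_hint` body: the return type (`String`, empty when no window is known), the trailing punctuation, and the singular "second" case are assumptions, while the 90-second threshold and the `(n / 60) + u64::from(n % 60 != 0)` ceiling come from the text above.

```rust
/// Illustrative rendering of an optional retry window into a message suffix.
/// Assumption: an empty string means "append nothing" when no window is known.
fn retry_after_hint(secs: Option<u64>) -> String {
    match secs {
        // No Retry-After supplied: do not invent a window.
        None => String::new(),
        Some(0) => "You can retry immediately.".to_string(),
        // Short windows are reported in seconds.
        Some(n) if n < 90 => {
            let unit = if n == 1 { "second" } else { "seconds" };
            format!("Try again in {n} {unit}.")
        }
        // Windows of 90s or more are reported in minutes, rounded UP so the
        // hint never tells the user to retry before the upstream allows it.
        Some(n) => {
            let mins = n / 60 + u64::from(n % 60 != 0);
            let unit = if mins == 1 { "minute" } else { "minutes" };
            format!("Try again in about {mins} {unit}.")
        }
    }
}
```

With floor division, 90 to 119 seconds would give `1`; the ceiling form gives `2` for the whole range, matching the values the new test pins.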

coderabbitai

🧹 Nitpick comments (1)

src/openhuman/channels/providers/web.rs (1)

305-309: 💤 Low value

Consider case-sensitivity and pattern fragility.

The is_action_budget_exhausted function receives an already-lowercased string (err_lower) from the caller, but the function signature doesn't enforce or document this contract. If a caller passes a non-lowercased string, the match will silently fail.

Additionally, these exact substring matches are brittle if the SecurityPolicy error messages ever change slightly (e.g., "Action budget exhausted" vs "action budget exhausted").

🔧 Optional: Add defensive lowercasing and doc comment

 /// Detect the SecurityPolicy global hourly action-budget signal
 /// emitted by the built-in tools (`web_fetch`, `curl`, `http_request`,
 /// `polymarket`, `composio`, etc.) — see `src/openhuman/security/
 /// policy.rs::SecurityPolicy::is_rate_limited`.
+///
+/// # Arguments
+/// * `err_lower` - The error string, expected to be pre-lowercased by the caller.
 ///
 /// We match the canonical English strings those tools emit. This is
 /// load-bearing for issue `#2364`: before this check ran, any string
 /// containing "rate limit" was misclassified as a provider 429 and
 /// the user saw the generic "You're being rate-limited" copy, which
 /// hides that the cap is OpenHuman's own per-hour safety budget,
 /// not the upstream LLM provider.
 fn is_action_budget_exhausted(err_lower: &str) -> bool {
+    // Defensive: ensure lowercase even if caller forgets
+    let err_lower = err_lower.to_ascii_lowercase();
     err_lower.contains("rate limit exceeded: action budget exhausted")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/channels/providers/web.rs` around lines 305 - 309, The helper
is_action_budget_exhausted currently assumes its input is lowercased and uses
brittle exact substrings; change it to accept &str, perform defensive
normalization inside (e.g., let s = err.to_lowercase();) and run the contains
checks against s so callers need not lowercase; add a short doc comment above
is_action_budget_exhausted indicating it normalizes the input and what error
substrings it checks for, and consider keeping the three existing substrings but
document they are lowercased for future maintainers.
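One way the suggested defensive normalization could look is sketched below. It is only an example of the reviewer's idea, not the merged code: the `NEEDLES` constant is a name introduced here, and the three substrings are the lowercased forms of the SecurityPolicy strings quoted in this PR.

```rust
/// Sketch: detect the SecurityPolicy hourly action-budget signal.
///
/// Normalizes the input itself, so callers do not need to lowercase first.
/// The needles are stored lowercased and must stay that way.
fn is_action_budget_exhausted(err: &str) -> bool {
    // Hypothetical constant name; the strings mirror the ones named in the PR.
    const NEEDLES: [&str; 3] = [
        "rate limit exceeded: action budget exhausted",
        "rate limit exceeded: too many actions in the last hour",
        "action blocked: rate limit exceeded",
    ];

    // Lowercasing an already-lowercased string is harmless, so this is safe
    // whether or not the caller normalized the input.
    let lower = err.to_lowercase();
    NEEDLES.iter().any(|needle| lower.contains(*needle))
}
```

The cost is one extra allocation per call, which is unlikely to matter on an error path.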

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ec2649e-61d8-49d3-90de-081fb75b091d

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7a74e and 553ec29.

📒 Files selected for processing (2)

src/openhuman/channels/providers/web.rs
src/openhuman/channels/providers/web_tests.rs

CodeGhost21 requested a review from a team May 20, 2026 20:04

coderabbitai Bot added the `working` label (A PR that is being worked on by the team.) May 20, 2026

coderabbitai Bot requested changes May 20, 2026

coderabbitai Bot reviewed May 20, 2026

coderabbitai Bot approved these changes May 20, 2026

CodeGhost21 wants to merge 2 commits into tinyhumansai:main from CodeGhost21:fix/2364-rate-limit-classification
