fix(tauri): retry main-window lookup on Windows after SW_SHOW (#3A)#2341

Open
oxoxDev wants to merge 1 commit into tinyhumansai:main from oxoxDev:fix/sentry-3a-tray-window-race

Conversation

@oxoxDev oxoxDev commented May 20, 2026

Summary

  • Wraps the app.get_webview_window("main") lookup in show_main_window with a 5×10ms bounded retry on Windows so the tray callback doesn't observe None during the brief window where the Tauri runtime hasn't re-tracked the main WebviewWindow record after SW_SHOW.
  • Closes OPENHUMAN-TAURI-3A (~15 events) — the Sentry warn [tray] failed to show main window from menu: main window not found fires while the OS window has already come back visible, so the user sees the window restore but Sentry sees noise and the post-show polish chain (unminimize, Tauri-level set_focus, CEF set_focus) is skipped.
  • Pure Windows-cfg-gated helper. macOS / Linux paths unchanged — the race is specific to the SW_HIDE close-to-tray flow on Windows.

Problem

On Windows, the app's close button doesn't destroy the main window. Instead it routes through set_main_window_hidden(true), which uses Win32 EnumWindows + SW_HIDE on the Chrome_WidgetWin_1 HWND so the user can restore from the tray (PR #1583). CEF treats the hidden host as gone, so the Tauri runtime drops its WebviewWindow record for "main" while the window is in the SW_HIDE state.

When the user clicks the tray menu's Show window item (or left-clicks the tray icon), show_main_window runs the Windows branch:

  1. set_main_window_hidden(false) issues SW_SHOW on the raw HWND — the OS window becomes visible immediately.
  2. app.get_webview("main") + webview.show() nudges the runtime to re-track.
  3. app.get_webview_window("main") synchronously expects a Some(_) back.

Step 3 races step 2 because the runtime re-tracks the window on the next event-loop tick, not synchronously with SW_SHOW. The lookup returns None, the ? propagates "main window not found", and the tray callback emits:

[tray] failed to show main window from menu: main window not found

User effect today: the SW_SHOW already worked, so the window IS visible — but the post-show polish chain (Tauri-level window.show(), unminimize(), set_focus(), CEF set_focus() for keyboard routing) is skipped, and Sentry pages on what looks like a fatal tray failure even though the user got what they wanted.
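The early-return behaviour described above can be sketched in isolation. This is a minimal model, not the code in app/src-tauri/src/lib.rs: show_main_window_sketch, the lookup closure (standing in for app.get_webview_window("main")), and the step labels are all hypothetical names chosen for illustration.

```rust
/// Hypothetical stand-in for the Windows branch of `show_main_window`.
/// `lookup` plays the role of `app.get_webview_window("main")`; `steps`
/// records which side effects ran, in order.
fn show_main_window_sketch(
    lookup: impl FnOnce() -> Option<&'static str>,
    steps: &mut Vec<&'static str>,
) -> Result<(), String> {
    // Raw-HWND SW_SHOW: always runs, so the OS window becomes visible.
    steps.push("SW_SHOW");

    // The lookup. A `None` here returns early through `?`.
    let _window = lookup().ok_or_else(|| "main window not found".to_string())?;

    // Post-show polish chain: only reached when the lookup succeeded.
    steps.push("window.show");
    steps.push("unminimize");
    steps.push("set_focus");
    steps.push("cef_set_focus");
    Ok(())
}
```

Calling it with a lookup that returns None leaves only the "SW_SHOW" step recorded and yields the "main window not found" error, which matches the symptom: a visible window, a skipped polish chain, and a warning in Sentry.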

15 occurrences observed on Windows 10/11 since 2026-05-09 (release 0.53.22 → 0.54.x), spread across at least one Foshan (CN), one Hyderabad (IN), and several US users. The CEF runtime + SW_HIDE close-to-tray code has not been touched on main since the issue was first seen, so the race is still live in 0.54.x.

Solution

Introduce get_main_webview_window_with_retry(app) next to show_main_window:

fn get_main_webview_window_with_retry(
    app: &AppHandle<AppRuntime>,
) -> Option<tauri::WebviewWindow<AppRuntime>> {
    #[cfg(target_os = "windows")]
    {
        const ATTEMPTS: usize = 5;
        const BACKOFF: std::time::Duration = std::time::Duration::from_millis(10);
        for attempt in 0..ATTEMPTS {
            if let Some(window) = app.get_webview_window("main") {
                if attempt > 0 {
                    log::debug!(
                        "[show_main_window] runtime re-tracked main window after {} retries",
                        attempt
                    );
                }
                return Some(window);
            }
            if attempt + 1 < ATTEMPTS {
                std::thread::sleep(BACKOFF);
            }
        }
        None
    }
    #[cfg(not(target_os = "windows"))]
    {
        app.get_webview_window("main")
    }
}

show_main_window swaps the single-call lookup for get_main_webview_window_with_retry(app). Worst case: 40 ms of std::thread::sleep on the tray callback thread (four 10 ms sleeps between five attempts; the loop does not sleep after the final attempt). The tray menu has already closed by the time the callback runs, so the small delay is invisible: the user sees SW_SHOW (window appears) immediately, and the focus/unminimize polish lands one tick late.

Hard cap at 5 attempts preserves the original error path if the runtime never re-tracks — that would be a real lifecycle bug worth a Sentry event, not a race worth papering over.
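The retry shape can also be exercised without Tauri. The sketch below is a hypothetical generic variant (retry_lookup is not part of the PR) that takes the lookup as a closure and reports how many retries were needed. It assumes the same control flow as the helper above: check first, sleep only between attempts, and give up with None at the cap.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical generic form of the bounded retry: call `lookup` up to
/// `attempts` times, sleeping `backoff` only between failed attempts.
/// On success, returns the value and the number of retries it took.
fn retry_lookup<T>(
    attempts: usize,
    backoff: Duration,
    mut lookup: impl FnMut() -> Option<T>,
) -> Option<(T, usize)> {
    for attempt in 0..attempts {
        if let Some(value) = lookup() {
            return Some((value, attempt));
        }
        // No sleep after the final attempt, so the total sleep time is
        // at most (attempts - 1) * backoff.
        if attempt + 1 < attempts {
            thread::sleep(backoff);
        }
    }
    // Cap reached: the caller keeps its original error path.
    None
}
```

A lookup that succeeds on its first call returns immediately with zero retries and no sleep, which is the fast path the PR relies on for the common case.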

Non-Windows platforms route through the single-call path because:

  • macOS close button routes through app.hide() (PR #2049, "Mac Close button does not dismiss the app window"), so the WebviewWindow record stays intact.
  • Linux/X11 keeps the WebviewWindow record across WM_DELETE_WINDOW — the close handler just hides without dropping the host.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • N/A: pure platform-cfg-gated retry helper around an existing Tauri API call. Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy — the race is a runtime timing bug between SW_SHOW (raw HWND) and Tauri's event-loop re-track step, not a unit-testable code path. Validation plan in the Impact section below.
  • N/A: only added rustdoc + a Windows-cfg-gated branch + a one-line call-site swap. Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml.
  • N/A: behaviour-only change, no feature row impact. Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related.
  • No new external network dependencies introduced (mock backend used per Testing Strategy).
  • N/A: classifier-adjacent fix; no release-cut surface (tray show-from-menu was already in the release smoke). Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md).
  • Linked issue closed via Closes #NNN in the ## Related section.

Impact

  • Runtime: on Windows, the tray Show window / tray icon left-click callback can now block its caller for up to 40 ms in the race window (four 10 ms sleeps between five attempts). The tray menu is closed by the time we reach this code path; no user-visible latency. macOS and Linux unchanged.
  • Performance: negligible. Four 10 ms sleeps is the worst case, and it only occurs when the race triggers (exactly the case in which the current code path emits a warn). The fast path (window already re-tracked) is a single get_webview_window call as before.
  • Security: no new attack surface; helper is private, single-purpose, and doesn't reach beyond app.get_webview_window.
  • Migration: none.
  • Compatibility: forward-compatible. If/when Tauri synchronously re-tracks the window after a host-visibility change (or our SW_SHOW path stops dropping the record at all), the retry loop short-circuits on the first attempt and the helper degrades to a no-op overhead.
  • Validation plan (no unit test added — see Submission Checklist above):
    • CI runs cargo check --manifest-path app/src-tauri/Cargo.toml and cargo clippy --manifest-path app/src-tauri/Cargo.toml -- -D warnings on Windows. Both pass locally on macOS; CI Windows runner exercises the cfg-gated body.
    • Manual smoke on Windows: launch dev:app, close to tray, right-click tray → Show window, verify focus + unminimize land and no [tray] failed to show main window from menu: main window not found lands in stderr. Repeat with left-click tray icon. (I can't smoke this locally on macOS — flagging honestly.)

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A (Sentry-driven triage; no Linear ticket)
  • URL: N/A

Commit & Branch

  • Branch: fix/sentry-3a-tray-window-race
  • Commit SHA: ca240f9 (tip)

Agent

  • Claude Code (Opus 4.7, 1M context) running locally

Summary by CodeRabbit

  • Bug Fixes
    • Improved main window display reliability by fixing a race condition on Windows that could prevent the window from appearing correctly.

…MAN-TAURI-3A)

Wrap the `app.get_webview_window("main")` call in `show_main_window`
with a 5×10ms bounded retry on Windows. The runtime drops its
`WebviewWindow` record for "main" while the close-to-tray flow has the
window hidden via `SW_HIDE`, and re-tracks it on the next event-loop
tick after `set_main_window_hidden(false)` issues `SW_SHOW` on the raw
HWND. The tray "Show window" callback can hit the lookup before that
re-track step lands, returning None even though the OS window is
visible — Sentry sees `[tray] failed to show main window from menu:
main window not found` while the user sees the window come back.

The 50 ms worst-case sleep is invisible to the user (tray menu is
closed by the time we get here) and the hard cap preserves the
existing error path if the runtime never re-tracks — that would
indicate a real lifecycle bug, not the race we're papering over.

Non-Windows platforms keep the single-call path because the race is
specific to the SW_HIDE close-to-tray flow added in tinyhumansai#1583; macOS
routes close through `app.hide()` (tinyhumansai#2049) and Linux/X11 keeps the
`WebviewWindow` record across `WM_DELETE_WINDOW`.

Closes OPENHUMAN-TAURI-3A
@oxoxDev oxoxDev requested a review from a team May 20, 2026 11:32
coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 509f8726-e89a-4686-b143-c5c18f9d4f04

📥 Commits

Reviewing files that changed from the base of the PR and between 41e7631 and ca240f9.

📒 Files selected for processing (1)
  • app/src-tauri/src/lib.rs

📝 Walkthrough

Walkthrough

The PR adds retry logic to handle a Windows-specific race condition: the Tauri runtime drops the main webview window record while the window is hidden, and may not have re-tracked it yet when the tray callback looks it up right after SW_SHOW. A new helper function attempts repeated lookups with bounded delay before returning None, while non-Windows platforms perform a single lookup. The show_main_window function is updated to use this helper.

Changes

Windows Main Window Retry Handler

Layer / File(s) | Summary
Retry helper and integration (app/src-tauri/src/lib.rs) | get_main_webview_window_with_retry performs bounded retries on Windows (up to 5 attempts with short sleeps) when looking up the "main" webview window, while non-Windows platforms do a single lookup. show_main_window is updated to call this helper instead of direct app.get_webview_window("main"), and inline comments are updated to explain that window re-tracking is not synchronous with SW_SHOW and that the retry bounds prevent the tray callback from missing the re-tracked window.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hops through Windows gates,
Where windows hide and briefly wait—
Five swift retries, then off we go,
The main one found, the tray will know. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main change: a Windows-specific fix for retrying main-window lookup after SW_SHOW to resolve a race condition.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@M3gA-Mind M3gA-Mind self-assigned this May 20, 2026

@graycyrus graycyrus left a comment

Nice detective work tracing the SW_HIDE → CEF record-drop → None race back to OPENHUMAN-TAURI-3A. The bounded-retry approach is reasonable in principle, the cfg gating is correct, and the doc comment is one of the best I've seen on a helper this small.

One concern about the retry mechanism that I think needs addressing before merge — see inline comment.

Comment thread app/src-tauri/src/lib.rs
attempt
);
}
return Some(window);

[major] std::thread::sleep blocks the current thread without pumping the Win32 message loop. The PR description says the runtime re-tracks the window "on the next event-loop tick" after SW_SHOW — but if the tray callback runs on the main thread (which is the thread driving the message pump), sleeping here prevents that tick from ever firing. The retry loop would exhaust all 5 attempts and still return None, making the fix a no-op in exactly the scenario it targets.

Could you verify which thread the tray on_menu_event handler dispatches on? If it's the main/UI thread, you'd need to pump messages between attempts instead of sleeping. Something like:

// Pump pending messages to let the runtime process the
// WM_SHOWWINDOW cascade that triggers re-tracking.
unsafe {
    let mut msg: MSG = std::mem::zeroed();
    while PeekMessageW(&mut msg, std::ptr::null_mut(), 0, 0, PM_REMOVE) != 0 {
        TranslateMessage(&msg);
        DispatchMessageW(&msg);
    }
}

Alternatively, if the re-tracking is driven by CEF's internal thread (not the Win32 message pump), then std::thread::sleep would work fine — but the PR description's "next event-loop tick" wording suggests the main loop. Worth confirming either way, since if the sleep deadlocks the re-track the fix is invisible and OPENHUMAN-TAURI-3A stays live.
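The two cases in this comment can be illustrated with a toy model that has no Tauri, CEF, or Win32 in it. Everything here is hypothetical: the "event loop" is a plain task queue, the "re-track" is an atomic flag, and the timings are arbitrary. It only shows why the thread matters; which case applies to the real tray handler is the open question.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

type Task = Box<dyn FnOnce()>;

/// Poll `tracked` up to `attempts` times, sleeping between failed attempts.
fn poll_with_sleep(tracked: &AtomicBool, attempts: usize, backoff: Duration) -> bool {
    for attempt in 0..attempts {
        if tracked.load(Ordering::SeqCst) {
            return true;
        }
        if attempt + 1 < attempts {
            thread::sleep(backoff);
        }
    }
    false
}

/// Case 1: the handler runs ON the loop thread. The re-track task is
/// queued behind it and cannot run until the handler returns.
fn handler_on_loop_thread() -> bool {
    let tracked = Arc::new(AtomicBool::new(false));
    let seen = Arc::new(AtomicBool::new(false));
    let mut queue: VecDeque<Task> = VecDeque::new();

    let (t, s) = (Arc::clone(&tracked), Arc::clone(&seen));
    queue.push_back(Box::new(move || {
        s.store(poll_with_sleep(&t, 5, Duration::from_millis(1)), Ordering::SeqCst);
    }));
    let t = Arc::clone(&tracked);
    queue.push_back(Box::new(move || t.store(true, Ordering::SeqCst)));

    // Single-threaded loop: tasks run strictly one after another.
    while let Some(task) = queue.pop_front() {
        task();
    }
    seen.load(Ordering::SeqCst)
}

/// Case 2: the re-track happens on another thread, so sleeping in the
/// handler does not hold it back.
fn handler_off_loop_thread() -> bool {
    let tracked = Arc::new(AtomicBool::new(false));
    let t = Arc::clone(&tracked);
    let other = thread::spawn(move || {
        thread::sleep(Duration::from_millis(5)); // the "next tick"
        t.store(true, Ordering::SeqCst);
    });
    // Generous budget so the toy model is not timing-sensitive.
    let seen = poll_with_sleep(&tracked, 50, Duration::from_millis(10));
    other.join().unwrap();
    seen
}
```

In this model handler_on_loop_thread() returns false however long it sleeps, while handler_off_loop_thread() returns true once the other thread has set the flag.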

@M3gA-Mind M3gA-Mind removed their assignment May 20, 2026