fix(tauri): retry main-window lookup on Windows after SW_SHOW (#3A) #2341
oxoxDev wants to merge 1 commit into
…MAN-TAURI-3A)
Wrap the `app.get_webview_window("main")` call in `show_main_window`
with a 5×10ms bounded retry on Windows. The runtime drops its
`WebviewWindow` record for "main" while the close-to-tray flow has the
window hidden via `SW_HIDE`, and re-tracks it on the next event-loop
tick after `set_main_window_hidden(false)` issues `SW_SHOW` on the raw
HWND. The tray "Show window" callback can hit the lookup before that
re-track step lands, returning None even though the OS window is
visible — Sentry sees `[tray] failed to show main window from menu:
main window not found` while the user sees the window come back.
The 50 ms worst-case sleep is invisible to the user (tray menu is
closed by the time we get here) and the hard cap preserves the
existing error path if the runtime never re-tracks — that would
indicate a real lifecycle bug, not the race we're papering over.
Non-Windows platforms keep the single-call path because the race is
specific to the SW_HIDE close-to-tray flow added in tinyhumansai#1583; macOS
routes close through `app.hide()` (tinyhumansai#2049) and Linux/X11 keeps the
`WebviewWindow` record across `WM_DELETE_WINDOW`.
Closes OPENHUMAN-TAURI-3A
graycyrus left a comment
Nice detective work tracing the SW_HIDE → CEF record-drop → None race back to OPENHUMAN-TAURI-3A. The bounded-retry approach is reasonable in principle, the cfg gating is correct, and the doc comment is one of the best I've seen on a helper this small.
One concern about the retry mechanism that I think needs addressing before merge — see inline comment.
                attempt
            );
        }
        return Some(window);
[major] `std::thread::sleep` blocks the current thread without pumping the Win32 message loop. The PR description says the runtime re-tracks the window "on the next event-loop tick" after `SW_SHOW`, but if the tray callback runs on the main thread (which is the thread driving the message pump), sleeping here prevents that tick from ever firing. The retry loop would exhaust all 5 attempts and still return `None`, making the fix a no-op in exactly the scenario it targets.

Could you verify which thread the tray `on_menu_event` handler dispatches on? If it's the main/UI thread, you'd need to pump messages between attempts instead of sleeping. Something like:

    // Pump pending messages to let the runtime process the
    // WM_SHOWWINDOW cascade that triggers re-tracking.
    unsafe {
        let mut msg: MSG = std::mem::zeroed();
        while PeekMessageW(&mut msg, std::ptr::null_mut(), 0, 0, PM_REMOVE) != 0 {
            TranslateMessage(&msg);
            DispatchMessageW(&msg);
        }
    }

Alternatively, if the re-tracking is driven by CEF's internal thread (not the Win32 message pump), then `std::thread::sleep` would work fine, but the PR description's "next event-loop tick" wording suggests the main loop. Worth confirming either way: if the sleep blocks the very tick that performs the re-track, the fix does nothing and OPENHUMAN-TAURI-3A stays live.
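The concern above can be illustrated with a toy, std-only model. This is a sketch under stated assumptions, not the Tauri or CEF runtime: `EventLoop`, `Registry`, `show_raw_window`, `retry_with_sleep`, and `retry_with_pump` are hypothetical names, and the model assumes the re-track is a task queued on the same thread that runs the tray callback.

```rust
use std::collections::VecDeque;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the runtime's window registry.
#[derive(Default)]
struct Registry {
    main_tracked: bool,
}

/// Toy single-threaded event loop: queued tasks run only when `pump` is called.
struct EventLoop {
    queue: VecDeque<Box<dyn FnOnce(&mut Registry)>>,
    registry: Registry,
}

impl EventLoop {
    fn new() -> Self {
        EventLoop { queue: VecDeque::new(), registry: Registry::default() }
    }

    /// Stand-in for SW_SHOW: the re-track is deferred to the next tick.
    fn show_raw_window(&mut self) {
        self.queue
            .push_back(Box::new(|r: &mut Registry| r.main_tracked = true));
    }

    /// Drain every pending task, as a message pump would.
    fn pump(&mut self) {
        while let Some(task) = self.queue.pop_front() {
            task(&mut self.registry);
        }
    }

    fn lookup(&self) -> Option<&'static str> {
        self.registry.main_tracked.then_some("main")
    }
}

/// Retry that only sleeps: on the loop's own thread nothing drains the queue.
fn retry_with_sleep(el: &EventLoop, attempts: u32) -> Option<&'static str> {
    for _ in 0..attempts {
        if let Some(window) = el.lookup() {
            return Some(window);
        }
        thread::sleep(Duration::from_millis(1));
    }
    None
}

/// Retry that pumps between attempts: the deferred re-track gets to run.
fn retry_with_pump(el: &mut EventLoop, attempts: u32) -> Option<&'static str> {
    for _ in 0..attempts {
        if let Some(window) = el.lookup() {
            return Some(window);
        }
        el.pump();
    }
    None
}
```

In this model the sleeping variant returns `None` after every attempt, while the pumping variant succeeds on its second lookup. Whether the real runtime behaves like this depends on which thread performs the re-track, which is the open question in this comment.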
Summary
- Wrap the `app.get_webview_window("main")` lookup in `show_main_window` with a 5×10ms bounded retry on Windows so the tray callback doesn't observe `None` during the brief window where the Tauri runtime hasn't re-tracked the main `WebviewWindow` record after `SW_SHOW`.
- `[tray] failed to show main window from menu: main window not found` fires while the OS window has already come back visible, so the user sees the window restore but Sentry sees noise and the post-show polish chain (unminimize, Tauri-level `set_focus`, CEF `set_focus`) is skipped.
- The race is specific to the `SW_HIDE` close-to-tray flow on Windows.

Problem
On Windows, the app's close button doesn't destroy the main window. It routes through `set_main_window_hidden(true)`, which uses Win32 `EnumWindows` + `SW_HIDE` on the `Chrome_WidgetWin_1` HWND so the user can restore from the tray (PR #1583). CEF treats the hidden host as gone, so the Tauri runtime drops its `WebviewWindow` record for `"main"` while the window is in the SW_HIDE state.

When the user clicks the tray menu's Show window item (or left-clicks the tray icon), `show_main_window` runs the Windows branch:

1. `set_main_window_hidden(false)` issues `SW_SHOW` on the raw HWND, and the OS window becomes visible immediately.
2. `app.get_webview("main")` + `webview.show()` nudges the runtime to re-track.
3. `app.get_webview_window("main")` synchronously expects a `Some(_)` back.

Step 3 races step 2 because the runtime re-tracks the window on the next event-loop tick, not synchronously with `SW_SHOW`. The lookup returns `None`, the `?` propagates `"main window not found"`, and the tray callback emits:

    [tray] failed to show main window from menu: main window not found

User effect today: the SW_SHOW already worked, so the window IS visible, but the post-show polish chain (Tauri-level `window.show()`, `unminimize()`, `set_focus()`, CEF `set_focus()` for keyboard routing) is skipped, and Sentry pages on what looks like a fatal tray failure even though the user got what they wanted.

15 occurrences observed on Windows 10/11 since 2026-05-09 (release 0.53.22 → 0.54.x), spread across at least one Foshan (CN), one Hyderabad (IN), and several US users. The CEF runtime + SW_HIDE close-to-tray code has not been touched on main since the issue was first seen, so the race is still live in 0.54.x.
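The sequence described in this section can be modelled in a few lines. This is a minimal sketch with hypothetical names (`Runtime`, `tick`, and the boolean fields are illustrative and not part of Tauri); it assumes the re-track lands only on the next tick, and it omits the step-2 nudge because in this model it has no synchronous effect.

```rust
/// Hypothetical model of the state involved in the race.
struct Runtime {
    os_visible: bool,      // what the user sees
    main_tracked: bool,    // whether the runtime has a record for "main"
    retrack_pending: bool, // re-track queued for the next tick
    polished: bool,        // unminimize / focus chain has run
}

impl Runtime {
    /// State after close-to-tray: window hidden, record dropped.
    fn hidden_to_tray() -> Self {
        Runtime {
            os_visible: false,
            main_tracked: false,
            retrack_pending: false,
            polished: false,
        }
    }

    /// Step 1: SW_SHOW makes the OS window visible at once,
    /// but the runtime's re-track is deferred.
    fn set_main_window_hidden_false(&mut self) {
        self.os_visible = true;
        self.retrack_pending = true;
    }

    /// One event-loop tick: the deferred re-track lands here.
    fn tick(&mut self) {
        if self.retrack_pending {
            self.main_tracked = true;
            self.retrack_pending = false;
        }
    }

    /// Step 3: synchronous lookup of the "main" record.
    fn get_webview_window(&self) -> Option<()> {
        self.main_tracked.then_some(())
    }
}

fn show_main_window(rt: &mut Runtime) -> Result<(), String> {
    rt.set_main_window_hidden_false();
    // The lookup runs before any tick, so `?` returns early on the first call.
    rt.get_webview_window()
        .ok_or_else(|| "main window not found".to_string())?;
    rt.polished = true; // post-show polish chain
    Ok(())
}
```

The first call after close-to-tray returns the error with the window already visible and the polish chain skipped; a call made after one tick succeeds.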
Solution
Introduce `get_main_webview_window_with_retry(app)` next to `show_main_window`.

`show_main_window` swaps the single-call lookup for `get_main_webview_window_with_retry(app)`. Worst case: 50 ms of `std::thread::sleep` on the tray callback thread. The tray menu is already closed by the time the user clicks an item, so the small delay is invisible: the user sees `SW_SHOW` (window appears) immediately, and the focus/unminimize polish lands one tick late.

Hard cap at 5 attempts preserves the original error path if the runtime never re-tracks. That would be a real lifecycle bug worth a Sentry event, not a race worth papering over.
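For readers without the diff at hand, the bounded-retry shape can be sketched generically. This is not the body of `get_main_webview_window_with_retry`: `lookup_with_retry`, `MAX_ATTEMPTS`, and `RETRY_DELAY` are hypothetical names, and the lookup is passed as a closure so the sketch runs without Tauri. It assumes one sleep after each failed attempt, which gives the 5×10 ms = 50 ms worst case quoted above.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical constants mirroring the 5×10 ms budget.
const MAX_ATTEMPTS: u32 = 5;
const RETRY_DELAY: Duration = Duration::from_millis(10);

/// Call `lookup` up to `max_attempts` times, sleeping `delay` after each miss.
/// Returns `None` once the cap is hit so the caller's error path still fires.
fn lookup_with_retry<T>(
    mut lookup: impl FnMut() -> Option<T>,
    max_attempts: u32,
    delay: Duration,
) -> Option<T> {
    for _attempt in 1..=max_attempts {
        if let Some(found) = lookup() {
            return Some(found);
        }
        // Blocks the calling thread; nothing else runs on it meanwhile.
        thread::sleep(delay);
    }
    None
}
```

A caller would wrap the real lookup in the closure, for example `lookup_with_retry(|| app.get_webview_window("main"), MAX_ATTEMPTS, RETRY_DELAY)`, and keep the existing `"main window not found"` error for the `None` case.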
Non-Windows platforms route through the single-call path because:
- macOS routes close through `app.hide()` (PR #2049), so the `WebviewWindow` record stays intact.
- Linux/X11 keeps the `WebviewWindow` record across `WM_DELETE_WINDOW`; the close handler just hides without dropping the host.

Submission Checklist
- … `diff-cover`) meet the gate enforced by `.github/workflows/coverage.yml`.
- … `docs/TEST-COVERAGE-MATRIX.md` reflect this change.
- … `## Related`.
- … `docs/RELEASE-MANUAL-SMOKE.md`).
- … `Closes #NNN` in the `## Related` section.

Impact
- … `get_webview_window` call as before.
- … `app.get_webview_window`.
- … `cargo check --manifest-path app/src-tauri/Cargo.toml` and `cargo clippy --manifest-path app/src-tauri/Cargo.toml -- -D warnings` on Windows. Both pass locally on macOS; CI Windows runner exercises the cfg-gated body.
- … `[tray] failed to show main window from menu: main window not found` lands in stderr. Repeat with left-click tray icon. (I can't smoke this locally on macOS; flagging honestly.)

Related
- PR #1583 (`feat(tauri): hide main window to tray on Windows close`) introduced the SW_HIDE flow; PR #1607 (`fix(tauri): retry main-window lookup`) hardened against the same record-drop on a different code path; PR #2049 (`fix(shell): use app-level hide on macOS close button`) is why this PR is Windows-only.

AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Agent