fix: re-apply BOOST_NOINLINE to manage_exception_state (reverted in #324 merge)#331

Merged
olk merged 1 commit into boostorg:develop from vasily-sviridov:sviridov-patch-exceptions-tls-fixes
May 8, 2026
@vasily-sviridov
Contributor

What happened

This re-applies the fix from #324. That PR was merged on Mar 10 with BOOST_NOINLINE on the constructor and destructor of manage_exception_state, but the fix is no longer in develop. I'm not sure exactly what happened; it looks like it may have been reverted after the build failures reported by @pdimov and @grisumbras. Either way, the fix is gone and the issue is still hitting us in production.

This PR just brings back commit da3fa4f from #324, without any other changes.

Why BOOST_NOINLINE

__cxa_get_globals() is a pure/const-attributed function, so once the constructor and destructor are inlined into the same caller, the compiler is free to call it once and reuse the returned pointer for both. If the fiber migrates to another thread between the two calls, both end up reading/writing the TLS slot of the original thread, and the exception state is wrong on resume. The disassembly showing the single __cxa_get_globals call is in the original issue: #323.

Making the ctor/dtor noinline forces the compiler to treat them as opaque call sites and re-evaluate __cxa_get_globals() each time.
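For readers who want the shape of the pattern without digging into the headers, here is a minimal sketch. It is not the actual Boost code: `SKETCH_NOINLINE`, `fake_eh_globals`, `get_globals()`, `manage_state_sketch` and `demo_migrated_restore()` are all hypothetical stand-ins, and `get_globals()` is an ordinary function rather than a const-attributed one, so it only illustrates the intended behaviour (save on one thread, restore into the TLS slot of whichever thread runs the destructor), not the miscompilation itself. The thread hand-off stands in for a fiber migrating between threads.

```cpp
#include <memory>
#include <thread>

// Hypothetical stand-in for BOOST_NOINLINE, so the sketch needs no Boost headers.
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Hypothetical stand-in for the per-thread exception globals.
struct fake_eh_globals {
    int uncaught = 0;
};

// Hypothetical stand-in for __cxa_get_globals(): returns the calling thread's slot.
SKETCH_NOINLINE fake_eh_globals* get_globals() {
    thread_local fake_eh_globals globals;
    return &globals;
}

// Saves the current thread's state on construction and restores it on
// destruction. Because both members are noinline, each one performs its
// own lookup of the thread-local slot on the thread it actually runs on.
struct manage_state_sketch {
    SKETCH_NOINLINE manage_state_sketch() : saved_(*get_globals()) {}
    SKETCH_NOINLINE ~manage_state_sketch() { *get_globals() = saved_; }

    fake_eh_globals saved_;
};

// Simulates a migration: the guard is created on the calling thread and
// destroyed on a worker thread. Returns the value the worker sees in its
// own slot after the restore.
int demo_migrated_restore() {
    get_globals()->uncaught = 7;                      // state on the original thread
    auto guard = std::make_unique<manage_state_sketch>();
    get_globals()->uncaught = 0;                      // original thread moves on

    int seen_on_worker = -1;
    std::thread worker([&] {
        guard.reset();                                // destructor runs here
        seen_on_worker = get_globals()->uncaught;     // worker's own slot
    });
    worker.join();
    return seen_on_worker;
}
```

With this sketch, the saved state lands in the worker thread's slot and the original thread's slot is left alone, which is the behaviour the real ctor/dtor pair is supposed to have after a migration.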

Why not volatile

The volatile cast (@CreoValis's suggestion in #324) prevents the compiler from optimizing away the dereference, but the pointer returned by __cxa_get_globals() can still be cached: the problem is the address computation, not the access through it. The build failures on GCC showed that it doesn't even compile cleanly, and it wouldn't fix the root cause anyway.

Prior art

The same fundamental issue is documented in userver-framework/userver#242, along with the upstream compiler issues llvm/llvm-project#98479 and GCC #26461. The userver team reached the same conclusion after investigating: noinline is the only reliable workaround without a dedicated compiler flag for fiber-safe TLS.

We're hitting this in production on Boost 1.87+. The symptom is exception state ending up in the wrong thread after a fiber resumes on a different one.

Fixes #323

@olk olk merged commit d2142b6 into boostorg:develop May 8, 2026
21 of 26 checks passed
@olk
Member

olk commented May 8, 2026

ty

Development

Successfully merging this pull request may close these issues.

Exception state can be saved to incorrect thread in 1.88
