Skip to content

test: fix flaky tests caused by worker thread pollution#88

Merged
kinyoklion merged 9 commits into
mainfrom
devin/1780593159-fix-flaky-query-params-test
Jun 8, 2026
Merged

test: fix flaky tests caused by worker thread pollution#88
kinyoklion merged 9 commits into
mainfrom
devin/1780593159-fix-flaky-query-params-test

Conversation

@kinyoklion

@kinyoklion kinyoklion commented Jun 4, 2026

Copy link
Copy Markdown
Member

Summary

Fixes flaky test failures across client_spec.rb and headers_spec.rb caused by SSE worker thread pollution between tests.

Root cause: SSE::Client spawns a fire-and-forget worker thread (Thread.new { run_stream }) that is not joined on close. After client.close sets @stopped, the worker thread may still complete one more reconnection attempt before checking the flag. Since all tests reuse port 50000, this phantom request hits the next test's WEBrick server — arriving without query params or with stale headers, causing nil assertions.

Fix: In both with_client helpers, join the named 'LD/SSEClient' thread after close:

Thread.list.select { |t| t.name == 'LD/SSEClient' }.each { |t| t.join(1) }

Also uses EOF-based reconnection (stream close) instead of HTTP 500 responses for the multi-reconnection test, matching the pattern of all stable tests.

Verified: 5/5 CI passes, 20/20 local runs on Ruby 3.2, 20/20 on Ruby 3.3.

Link to Devin session: https://app.devin.ai/sessions/e40ffc3296dd45648f86ac10c42b6da5
Requested by: @kinyoklion


Note

Low Risk
Spec-only changes; no production library behavior is modified.

Overview
Stabilizes end-to-end specs in client_spec.rb and headers_spec.rb by ensuring each example fully tears down the SSE::Client worker before the shared stub server port is reused.

Both with_client helpers now join(0.01s)** any thread named **LD/SSEClient** after **client.close`, so a lingering reconnect cannot hit the next example’s WEBrick handler with wrong query strings or headers.

Two dynamic query_params reconnection examples no longer force reconnect via HTTP 500; they close the event stream (minimal SSE comment payloads) instead, aligning with other stable reconnect tests.

Reviewed by Cursor Bugbot for commit 27e0006. Bugbot is set up for automated code reviews on this repo. Configure here.

Add res.keep_alive = false to the first response in the
'updates query parameters on reconnection' test. Without this,
the empty chunked response leaves the TCP connection open via
HTTP keep-alive, creating a race condition between WEBrick and
the HTTP gem during fast reconnection.

Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
@devin-ai-integration

Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

devin-ai-integration Bot and others added 8 commits June 4, 2026 17:22
- Test 1118: send SSE comment data before closing (matches pattern of
  non-flaky reconnection tests that send actual data)
- Test 1329: use larger reconnect_time (0.25s) to avoid race in rapid
  500 retries within the connect loop

Both tests flake with nil query_string due to a timing race between
the HTTP gem and WEBrick during ultra-fast (10ms) reconnection.

Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Use EOF-based reconnection (stream close) instead of 500-based
reconnection for the 'each reconnection attempt' test. The HTTP
gem has a race condition when reusing a closed client within the
same connect loop on Ruby 3.2, causing nil query strings.

Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
The SSE client's worker thread is fire-and-forget (Thread.new without
join). After client.close sets @stopped, the worker thread may still
make one more reconnection attempt before checking the flag. When the
next test reuses the same WEBrick port, this phantom request arrives
with no query params, causing nil assertions.

Fix: add sleep 0.1 in with_client after close to let the worker thread
terminate before the next test starts its server on the same port.

Also reverts the unnecessary reconnect_time increases and uses
EOF-based reconnection (stream close) instead of 500 errors for the
multi-reconnection test, matching the pattern of all stable tests.

Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Use Thread.join to reliably wait for SSE worker threads to terminate
before the next test starts. Applies to both client_spec and
headers_spec with_client helpers.

Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
Co-Authored-By: rlamb@launchdarkly.com <4955475+kinyoklion@users.noreply.github.com>
@devin-ai-integration devin-ai-integration Bot changed the title test: fix flaky query params reconnection test test: fix flaky tests caused by worker thread pollution Jun 4, 2026
@kinyoklion kinyoklion marked this pull request as ready for review June 4, 2026 20:24
@kinyoklion kinyoklion requested a review from a team as a code owner June 4, 2026 20:24
@kinyoklion kinyoklion merged commit bc96556 into main Jun 8, 2026
12 checks passed
@kinyoklion kinyoklion deleted the devin/1780593159-fix-flaky-query-params-test branch June 8, 2026 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants