fix(runners): critical correctness bugs in runner registration flow #3916

Draft
cursor[bot] wants to merge 3 commits into develop from cursor/critical-correctness-bugs-efb4

Conversation

cursor[bot] commented Jun 3, 2026

Summary

Automated bug scan of recent runner registration commits (b5dc8c7b) found three critical correctness issues. This PR applies minimal, targeted fixes with tests.

Bug 1: Bolt runner stays active after registration reset

Impact: Tasks get stuck in starting status indefinitely on deployments that use Bolt storage.

Root cause: ResetRunnerRegistration in db/bolt/global_runner.go cleared the auth token but did not set Active=false, unlike the SQL implementation. The task scheduler continued assigning work to a runner that could no longer authenticate.

Fix: Set runner.Active = false in the Bolt reset path. Added Test_ResetRunnerRegistration_DeactivatesRunner and enabled the previously commented Active assertion in runner_svc_test.go.

Trigger: Admin regenerates a registration token for an already-registered, active runner while using Bolt storage.
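
For reviewers, the intended invariant can be sketched in isolation. This is a simplified in-memory model, not the actual Bolt store code: the Runner struct, its field names, and the assignable helper are assumptions made for illustration. The point is only that clearing the auth token and clearing Active happen in the same step.

```go
package main

import "fmt"

// Runner is a simplified stand-in for the stored runner record.
// The field names are assumptions, not the project's actual schema.
type Runner struct {
	ID                int
	Token             string
	RegistrationToken string
	Active            bool
}

// resetRegistration models the fixed behaviour: the auth token is cleared
// and the runner is deactivated in the same step.
func resetRegistration(r *Runner, newRegistrationToken string) {
	r.Token = ""
	r.RegistrationToken = newRegistrationToken
	r.Active = false
}

// assignable is a hypothetical scheduler check that skips inactive runners.
func assignable(r Runner) bool {
	return r.Active
}

func main() {
	r := Runner{ID: 1, Token: "old-auth-token", Active: true}
	resetRegistration(&r, "new-registration-token")
	fmt.Println(r.Active, assignable(r)) // false false
}
```

With Active left at true (the old Bolt behaviour), assignable would still return true for a runner whose token is empty, which matches the stuck-task symptom described above.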

Bug 2: Runner unregister always fails with 401

Impact: semaphore runner unregister never removes the runner from the server; local token file is not cleaned up on auth failure.

Root cause: JobPool.Unregister() sent DELETE /api/internal/runners without the X-Runner-Token header required by RunnerMiddleware.

Fix: Add req.Header.Set("X-Runner-Token", util.Config.Runner.Token) before the DELETE request.

Bug 3: SQL one-time registration token race

Impact: Concurrent registration with the same one-time token could leave one registrant with an invalid auth token.

Root cause: RegisterRunner used SELECT → validate → UPDATE without an atomic consume guard on the registration token column.

Fix: Include registration_token=? in the UPDATE WHERE clause and fail if zero rows are affected.

Validation

  • go test ./db/bolt/... ./services/server/... — all pass, including new Test_ResetRunnerRegistration_DeactivatesRunner

cursoragent and others added 3 commits June 3, 2026 11:06

ResetRunnerRegistration in the Bolt store cleared the auth token but left
Active=true, unlike the SQL implementation. The task scheduler kept assigning
work to a runner that could no longer authenticate, leaving tasks stuck in
starting status indefinitely.

Co-authored-by: Denis Gukov <fiftin@outlook.com>

The unregister client omitted the X-Runner-Token header required by
RunnerMiddleware, so every DELETE /api/internal/runners request returned 401
and the runner was never removed from the server.

Co-authored-by: Denis Gukov <fiftin@outlook.com>

RegisterRunner used a read-then-update pattern without guarding the UPDATE,
so concurrent registration requests with the same one-time token could both
succeed and leave one registrant with an invalid auth token.

Co-authored-by: Denis Gukov <fiftin@outlook.com>