security(daemon): cap browser connections per session in wsproxy#289
security(daemon): cap browser connections per session in wsproxy#289mgabor3141 wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Try this PRcurl -sSfL https://gmux.app/install-pr.sh | sh -s -- 289Built from |
c5cc784 to
e1e2319
Compare
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
@greptile review |
|
| Filename | Overview |
|---|---|
| services/gmuxd/internal/wsproxy/wsproxy.go | Adds per-session connection cap (maxClientsPerSession=8): addConn now returns bool and enforces the limit under the same lock; slot is reserved before backend dial and released on failure; refused connections receive StatusTryAgainLater with a clear reason. Logic is correct and atomic. |
| services/gmuxd/internal/wsproxy/wsproxy_test.go | New test file covering cap enforcement, per-session isolation, and freed-slot reuse; integration test drives the full HTTP handler. Minor: the early-return path in TestHandlerRefusesBeyondCapWithCloseReason skips the close-code, close-reason, and viewer-count assertions when dial errors out. |
Sequence Diagram
sequenceDiagram
participant B as Browser
participant P as wsproxy Handler
participant S as p.sessions (locked)
participant R as gmux-run Unix socket
B->>P: HTTP WebSocket Upgrade
P->>P: websocket.Accept()
P->>S: addConn(sessionID, conn)
alt "at cap (len >= maxClientsPerSession)"
S-->>P: return false
P->>B: Close(StatusTryAgainLater, "too many connections")
P-->>P: return
else slot available
S-->>P: "append & return true"
P->>R: websocket.Dial(backend)
alt backend dial fails
R-->>P: error
P->>S: removeConn(sessionID, conn)
P->>B: Close(StatusInternalError, "backend unavailable")
P-->>P: return
else dial succeeds
R-->>P: backendConn
P->>P: spawn 3 goroutines (B to R, R to B, keepalive)
P->>P: wg.Wait()
P->>B: Close(StatusNormalClosure)
P->>R: Close(StatusNormalClosure)
P->>S: removeConn(sessionID, conn)
end
end
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
services/gmuxd/internal/wsproxy/wsproxy_test.go:168-176
**Early-return silently skips the test's key assertions**
If `dial()` returns a non-nil error for the cap+1 connection (which the comment acknowledges as possible), the test exits without ever checking the `StatusTryAgainLater` close code, the "too many connections" close reason, or the post-refusal viewer count. The PR description specifically claims the connection is refused "with a clear WebSocket close reason (`StatusTryAgainLater`)" — but if this branch fires, none of that is verified and the test still passes green. Consider moving the three assertions (close code, reason, viewer count) to run unconditionally before this early return, or at minimum assert them in the nil-error branch without allowing a silent pass for the non-nil case.
Reviews (1): Last reviewed commit: "security(daemon): cap browser connection..." | Re-trigger Greptile
| if n == maxClientsPerSession { | ||
| break | ||
| } | ||
| if time.Now().After(deadline) { | ||
| t.Fatalf("only %d/%d connections registered", n, maxClientsPerSession) | ||
| } | ||
| time.Sleep(5 * time.Millisecond) | ||
| } | ||
|
|
There was a problem hiding this comment.
Early-return silently skips the test's key assertions
If dial() returns a non-nil error for the cap+1 connection (which the comment acknowledges as possible), the test exits without ever checking the StatusTryAgainLater close code, the "too many connections" close reason, or the post-refusal viewer count. The PR description specifically claims the connection is refused "with a clear WebSocket close reason (StatusTryAgainLater)" — but if this branch fires, none of that is verified and the test still passes green. Consider moving the three assertions (close code, reason, viewer count) to run unconditionally before this early return, or at minimum assert them in the nil-error branch without allowing a silent pass for the non-nil case.
Prompt To Fix With AI
This is a comment left during a code review.
Path: services/gmuxd/internal/wsproxy/wsproxy_test.go
Line: 168-176
Comment:
**Early-return silently skips the test's key assertions**
If `dial()` returns a non-nil error for the cap+1 connection (which the comment acknowledges as possible), the test exits without ever checking the `StatusTryAgainLater` close code, the "too many connections" close reason, or the post-refusal viewer count. The PR description specifically claims the connection is refused "with a clear WebSocket close reason (`StatusTryAgainLater`)" — but if this branch fires, none of that is verified and the test still passes green. Consider moving the three assertions (close code, reason, viewer count) to run unconditionally before this early return, or at minimum assert them in the nil-error branch without allowing a silent pass for the non-nil case.
How can I resolve this? If you propose a fix, please make it concise.
Implements review ticket T25.
The wsproxy held an unbounded number of browser WebSocket connections per session (
p.sessions[id]). An authed-but-buggy client stuck in a reconnect loop (the #242 storm) could pile up connections without limit, each carrying proxy goroutines plus a 4 MiB backend read buffer — a self-DoS / runaway-client footgun with no shedding. This adds a generous-but-bounded per-session cap (maxClientsPerSession = 8): connections beyond the cap are refused with a clear WebSocket close reason (StatusTryAgainLater, "too many connections for this session") rather than evicting healthy viewers, so a storming client can't knock an existing tab offline. The check and registration happen under the same lock, so concurrent connects can't race past the cap. The cap is per-session, so legit multi-tab / phone+desktop use of other sessions is unaffected, and a freed slot (viewer disconnect) is immediately reusable by a reconnecting client.Scope note: the ticket also mentions an optional per-remote-addr cap on
/v1/eventsSSE, but that part overlaps T14 and is left out of this focused change. The runner-sideptyserver.s.clientsmap is auth-gated behind an owner-only Unix socket; the hub-facing wsproxy is the meaningful blast-radius boundary, so the cap lives there.Verification
cd services/gmuxd && go build ./...— passesgo test ./internal/wsproxy/— new tests cover: refusal beyond the cap, slot not stored on refusal, cap is per-session, and a freed slot is reusablego test ./...— full daemon suite passesSource finding: T25 (P2) connection caps.