Skip to content

fix(middleware): DefaultErrorHandler responds 503 instead of panicking on store error#266

Open
desperatee wants to merge 1 commit into
ulule:masterfrom
desperatee:fix/default-error-handler-no-panic
Open

fix(middleware): DefaultErrorHandler responds 503 instead of panicking on store error#266
desperatee wants to merge 1 commit into
ulule:masterfrom
desperatee:fix/default-error-handler-no-panic

Conversation

@desperatee

Copy link
Copy Markdown

Summary

All three middleware adapters define the same broken default:

// drivers/middleware/stdlib/options.go:30-33
// drivers/middleware/gin/options.go:30-33
// drivers/middleware/fasthttp/options.go:28-31

func DefaultErrorHandler(...) {
    panic(err)
}

Panic-on-store-error has three downsides:

  1. Stdlib/gin: net/http recovers the panic into a 500, but every recovered panic generates a stack trace and runs the full deferred chain. During a Redis outage, every request panics — turning a small store hiccup into a CPU + log spike. Operators see 500s with stack traces; the actual root cause (transient store error) is buried.
  2. fasthttp: no built-in panic recovery. Connections drop rather than returning a clean error.
  3. Semantic: a rate-limiter store error is a service-availability issue, not a programmer error. 503 (Service Unavailable) is the right HTTP status; panic is not.

Fix

Replace panic(err) with:

log.Printf("limiter: store error: %v", err)
// then write 503 via the adapter's idiomatic mechanism:
//   stdlib:   http.Error(w, "Service Unavailable", http.StatusServiceUnavailable)
//   gin:      c.String(http.StatusServiceUnavailable, "Service Unavailable")
//   fasthttp: ctx.SetStatusCode(fasthttp.StatusServiceUnavailable); ctx.Response.SetBodyString(...)

This is a sensible default. Operators who want fail-open or a custom status can override via WithErrorHandler.

⚠️ Behavior change

The default response on store error changes from "500 + stack trace + log noise" (stdlib/gin) or "dropped connection" (fasthttp) to "503 + log line". Operators who were deliberately relying on the panic + recovery flow (unusual) need to add a custom WithErrorHandler to retain the old behavior.

Design alternatives considered

  • Fail-open silently (_ = err): too dangerous. Rate limiting silently disabled is a much worse default than 503.
  • Configurable strategy as a public type (e.g. LimiterErrorStrategy enum): more flexible but enlarges the API surface for an issue most users will never override.

503-by-default + override-via-WithErrorHandler is the minimal change that fixes the panic while keeping the API surface stable.

Verification

go test -race ./... (minus Redis driver, which needs local Redis) is green. The PoC test panics on master and returns 503 after the patch.

Severity

HIGH — production-grade DoS amplification: a transient store error becomes elevated CPU + log volume across the entire fleet. Affects all three adapters identically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant