fix(snuba): distinguish upstream proxy errors from snuba errors #115252
Draft
When envoy (or another upstream proxy) returns a 504 with body
"upstream request timeout" before reaching snuba, json.loads fails and
we raise a generic SnubaError("Failed to parse snuba error response").
This obscures the real failure mode: the request never reached snuba,
so there are no snuba traces and the cause appears in the proxy's
response body, not in snuba logs.
Introduce two new exception classes:
- InvalidSnubaResponseError: non-2xx response that can't be parsed as
JSON. Captures status and body for debugging.
- SnubaUpstreamRequestTimeout (subclass): 504, or any status with the
envoy-default body "upstream request timeout".
Both subclass SnubaError so existing handlers continue to work.
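
The hierarchy and the classification rule described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `SnubaError` is a stand-in for the existing base class, and `parse_error_body`, `ENVOY_TIMEOUT_BODY`, and the constructor signature are hypothetical names chosen for the example.

```python
import json
from typing import Any


class SnubaError(Exception):
    """Stand-in for the existing base class in src/sentry/utils/snuba.py."""


class InvalidSnubaResponseError(SnubaError):
    """Non-2xx response whose body could not be parsed as JSON."""

    def __init__(self, status: int, body: str) -> None:
        # Keep status and body on the exception so they are available for debugging.
        super().__init__(f"Invalid snuba response (status={status}): {body[:200]!r}")
        self.status = status
        self.body = body


class SnubaUpstreamRequestTimeout(InvalidSnubaResponseError):
    """The proxy gave up before the request reached snuba."""


# Envoy's default timeout body, as quoted in the description above.
ENVOY_TIMEOUT_BODY = "upstream request timeout"


def parse_error_body(status: int, body: str) -> Any:
    """Hypothetical helper: decode an error body or raise the most specific error."""
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if status == 504 or body.strip() == ENVOY_TIMEOUT_BODY:
            raise SnubaUpstreamRequestTimeout(status, body)
        raise InvalidSnubaResponseError(status, body)
```

Because both classes derive from `SnubaError`, a caller with `except SnubaError` still catches them; only callers that want to treat proxy timeouts differently need to name the subclass.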
Agent transcript: https://claudescope.sentry.dev/share/Issjz4igy1xszXogQRdyfKOkdGzESNNiXeq7QfRIlOI
Wire SnubaUpstreamRequestTimeout into handle_query_errors so it surfaces as TimeoutException (HTTP 504), the same path the frontend already handles for QueryExecutionTimeMaximum and ReadTimeoutError.
Agent transcript: https://claudescope.sentry.dev/share/A7_ad-be8QOnVd4HEJMyDyxUsVB8nzpyGk-jPA6R3P8
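
One way to picture the conversion to `TimeoutException` is a context manager that re-raises selected errors. The sketch below assumes that shape; every class in it is a local stand-in rather than the real Sentry definition, `handle_query_errors_sketch` is a hypothetical name, and `ReadTimeoutError` is left out for brevity.

```python
from contextlib import contextmanager


class SnubaError(Exception):
    """Stand-in for sentry.utils.snuba.SnubaError."""


class QueryExecutionTimeMaximum(SnubaError):
    """Stand-in for the existing query-timeout error."""


class SnubaUpstreamRequestTimeout(SnubaError):
    """Stand-in for the new proxy-timeout error."""


class TimeoutException(Exception):
    """Stand-in for the API exception that is rendered as HTTP 504."""

    status_code = 504


@contextmanager
def handle_query_errors_sketch():
    """Convert timeout-like snuba errors into one API-level timeout."""
    try:
        yield
    except (QueryExecutionTimeMaximum, SnubaUpstreamRequestTimeout) as error:
        # Chain the original error so the proxy status and body stay visible in traces.
        raise TimeoutException() from error
```

Used as `with handle_query_errors_sketch(): run_query()`, a proxy timeout reaches the caller as the same exception type as a query-execution timeout, while other `SnubaError` subclasses propagate unchanged.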
Summary
When envoy (or another upstream proxy) returns a 504 with body "upstream request timeout" before the request reaches snuba, _bulk_snuba_query fails to parse it as JSON and raises a generic SnubaError("Failed to parse snuba error response"). This obscures the real failure mode: the request never reached snuba, so there are no snuba traces, and the cause appears in the proxy's response body rather than in snuba logs.

Investigation context: issue 6969043325 is dominated by 504s from envoy in front of snuba-api, almost all with the literal body "upstream request timeout".

This change introduces two new exception classes in src/sentry/utils/snuba.py:
- InvalidSnubaResponseError: raised when a non-2xx response can't be decoded as JSON. Captures status and body for debugging.
- SnubaUpstreamRequestTimeout: subclass for 504 responses, or any status with the envoy-default body "upstream request timeout".

Both subclass SnubaError, so existing handlers that catch SnubaError continue to work; this only refines what callers can choose to catch.

It also wires SnubaUpstreamRequestTimeout into handle_query_errors in src/sentry/api/utils.py alongside QueryExecutionTimeMaximum / ReadTimeoutError, so it surfaces as the existing TimeoutException (HTTP 504), which the frontend already knows how to handle.

Test plan
- tests/sentry/utils/test_snuba.py::SnubaInvalidJsonResponseTest cover:
  - "upstream request timeout" body → SnubaUpstreamRequestTimeout
  - … → SnubaUpstreamRequestTimeout (defensive)
  - … ("no healthy upstream") → InvalidSnubaResponseError only
- tests/sentry/api/test_utils.py::HandleQueryErrorsTest::test_handle_snuba_upstream_request_timeout confirms SnubaUpstreamRequestTimeout is converted to TimeoutException by handle_query_errors
- tests/sentry/utils/test_snuba.py and tests/sentry/api/test_utils.py::HandleQueryErrorsTest pass
- prek run -q passes on touched files