Stop large list pages from running an unbounded total-count query #1233
Copilot wants to merge 4 commits into …
✅ Deploy Preview for antenna-preview ready!
✅ Deploy Preview for antenna-ssec ready!
Claude says: Pushed a small fix to this branch in "Fix: strip the ordering from the capped count subquery".

The order is irrelevant to a count, so …

Where the cap shines (and where it doesn't): on a genuinely large dense result set the cap is dramatic: an unbounded source-image …

The one thing to keep in mind for how we describe/use it: the cap bounds output rows, not input scanned, so it does nothing for a sparse/selective filter. For example, a filter matching only ~0.7% of rows measured 64 ms uncapped vs 62 ms capped, because the …
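The "bounds output rows, not input scanned" point can be illustrated with a toy model. This is only a sketch: it treats the query as a plain sequential scan in Python, whereas a real Postgres plan may use indexes, and the function name `capped_count_with_scan`, the row counts, and the filters are all made up for illustration.

```python
from itertools import islice


def capped_count_with_scan(rows, predicate, cap):
    """Return (count, is_exact, rows_scanned) for a capped count over `rows`."""
    scanned = 0

    def matching():
        nonlocal scanned
        for row in rows:
            scanned += 1  # every inspected row costs work, match or not
            if predicate(row):
                yield row

    # Stop after cap + 1 matches: one extra match proves the total exceeds the cap.
    found = sum(1 for _ in islice(matching(), cap + 1))
    if found > cap:
        return cap, False, scanned
    return found, True, scanned


rows = range(200_000)  # hypothetical table of 200,000 rows
dense = capped_count_with_scan(rows, lambda r: True, cap=10_000)
sparse = capped_count_with_scan(rows, lambda r: r % 1000 == 0, cap=10_000)
print(dense)   # (10000, False, 10001)
print(sparse)  # (200, True, 200000)
```

In the dense case the scan stops after 10,001 rows. In the sparse case the filter never produces enough matches to hit the cap, so every row is still inspected and the cap saves nothing.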
Claude says: Pushed …

What changed: previously, when a result set exceeded the threshold the response returned … Now, over the threshold the response returns the threshold as a lower-bound count plus a flag:

`{ "count": 10000, "count_is_exact": false, "next": "...", "previous": "..." }`

So the existing UI shows "10,000" with a working pager today, and the only follow-up frontend change is cosmetic: append a "+" when …

Details:
Backend cost is unchanged: same …
1a3b108 to c7b8e0e
Every paginated list endpoint runs a COUNT(*) over the filtered result set to populate `count`. On large, densely-filtered tables that count can dominate the request even when the page query itself is fast. This bounds the worst case.

- Counts stay exact up to COUNT_PRECISION_THRESHOLD (default 10,000). Beyond it the response returns the threshold as a lower bound with `count_is_exact: false`, which the UI renders as e.g. "10,000+", instead of scanning the whole table.
- The capped count strips the queryset's ORDER BY first so the LIMIT can short-circuit instead of forcing a top-N sort that would scan the whole set anyway: `SELECT COUNT(*) FROM (SELECT 1 ... LIMIT N) sub`.
- Callers can skip the total entirely with `?with_counts=false`, which returns `count: null` and runs no count query.
- `next`/`previous` fall back to a one-extra-row probe whenever the count is inexact or skipped, preserving the pagination contract.

Default behavior is unchanged for normal-sized result sets: an exact integer count with `count_is_exact: true`. New `count_is_exact` field documented in the OpenAPI schema.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
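The capped-count query shape can be tried outside Django. The following is a minimal sketch using the standard-library `sqlite3` module rather than the project's Postgres setup; the `captures` table, the `capped_count` helper, and the use of `LIMIT cap + 1` to tell "exactly N" from "more than N" are assumptions for illustration, not the PR's actual code.

```python
import sqlite3

COUNT_PRECISION_THRESHOLD = 10_000  # default named in the commit message


def capped_count(conn, table, cap=COUNT_PRECISION_THRESHOLD):
    """Return (count, count_is_exact) while producing at most cap + 1 inner rows."""
    # `table` is interpolated for brevity; it must be a trusted identifier.
    # The inner query has no ORDER BY, so the LIMIT can stop the scan early.
    sql = f"SELECT COUNT(*) FROM (SELECT 1 FROM {table} LIMIT ?) AS sub"
    (n,) = conn.execute(sql, (cap + 1,)).fetchone()
    if n > cap:
        return cap, False  # lower bound only
    return n, True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE captures (id INTEGER PRIMARY KEY)")  # hypothetical table
conn.executemany("INSERT INTO captures (id) VALUES (?)", [(i,) for i in range(25)])

print(capped_count(conn, "captures", cap=10))  # (10, False)
print(capped_count(conn, "captures", cap=25))  # (25, True)
print(capped_count(conn, "captures"))          # (25, True)
```

Fetching one row past the cap is what lets a result of exactly `cap` rows stay flagged as exact.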
c7b8e0e to 942bfa3
Pull request overview
This PR updates the project-wide DRF paginator to avoid unbounded COUNT(*) queries on large, densely-filtered list endpoints by capping count precision (exact up to a threshold, then returning the threshold as a lower bound) and by allowing callers to opt out of counts entirely via ?with_counts=false. It also adds a count_is_exact field to help API consumers distinguish exact vs capped vs skipped totals, and introduces tests to validate the new behaviors.
Changes:
- Implement capped counting and `with_counts=false` opt-out in `LimitOffsetPaginationWithPermissions`, plus probe-based `next`/`previous` logic when totals are inexact/skipped.
- Extend the paginated response shape with `count_is_exact` and mark `count` nullable in the response schema.
- Add API tests covering exact counts, capped/inexact counts, opt-out behavior, and navigation links.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `ami/base/pagination.py` | Adds capped-count + opt-out logic to the default paginator and extends response/schema with `count_is_exact`. |
| `ami/main/tests.py` | Adds `TestPaginationWithCounts` to validate exact/capped/skipped counts and next/previous behavior. |
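The probe-based `next`/`previous` logic mentioned in the overview can be sketched without a total count. This is a simplified stand-in over an in-memory list, not the paginator's implementation; `paginate_with_probe` and the `next_offset`/`previous_offset` keys are hypothetical names, and the real paginator emits URLs rather than offsets.

```python
def paginate_with_probe(rows, limit, offset):
    """Slice one page and decide next/previous without knowing the total."""
    window = rows[offset : offset + limit + 1]  # one extra row acts as the probe
    page = window[:limit]
    has_next = len(window) > limit  # the probe row exists, so another page does too
    has_previous = offset > 0
    return {
        "results": page,
        "next_offset": offset + limit if has_next else None,
        "previous_offset": max(offset - limit, 0) if has_previous else None,
    }


rows = list(range(25))
first = paginate_with_probe(rows, limit=10, offset=0)
middle = paginate_with_probe(rows, limit=10, offset=10)
last = paginate_with_probe(rows, limit=10, offset=20)
print(first["next_offset"], first["previous_offset"])    # 10 None
print(middle["next_offset"], middle["previous_offset"])  # 20 0
print(last["next_offset"], last["previous_offset"])      # None 10
```

The probe costs one extra row per page, which is why it works even when `count` is a lower bound or `null`.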
```python
capped = self._get_capped_count(queryset)
if capped is self._OVER_CAP:
    # Over the precision cap: report the threshold as an approximate
    # lower bound. It must not drive next/previous (the true total is
    # higher), so fall back to the probe-based links.
    self.count = self.COUNT_PRECISION_THRESHOLD
```
```python
def get_paginated_response_schema(self, schema):
    paginated_schema = super().get_paginated_response_schema(schema)
    # count is the exact total, the precision cap (a lower bound), or null
    # when the caller passed with_counts=false.
    paginated_schema["properties"]["count"]["nullable"] = True
    paginated_schema["properties"]["count_is_exact"] = {
        "type": "boolean",
        "nullable": True,
        "description": (
            "True when `count` is exact; false when it is the precision cap "
            '(a lower bound, render as e.g. "10,000+"); null when the count '
            "was skipped via with_counts=false."
        ),
    }
    return paginated_schema
```
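The three states that the schema description enumerates can be written out as a small lookup. This is a sketch for readers of the schema only: `count_fields` is a hypothetical helper, and its `total` argument stands in for the true number of matching rows, which the real paginator never computes past the cap.

```python
def count_fields(total, cap=10_000, with_counts=True):
    """Map a (hypothetically known) total to the `count` / `count_is_exact` pair."""
    if not with_counts:
        # Caller opted out via with_counts=false: both fields are null.
        return {"count": None, "count_is_exact": None}
    if total > cap:
        # Over the cap: report the cap itself as a lower bound.
        return {"count": cap, "count_is_exact": False}
    return {"count": total, "count_is_exact": True}


print(count_fields(42))                       # {'count': 42, 'count_is_exact': True}
print(count_fields(2_900_000))                # {'count': 10000, 'count_is_exact': False}
print(count_fields(42, with_counts=False))    # {'count': None, 'count_is_exact': None}
```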
…on.get_count
The capped count is computed over a stripped queryset (ordering removed,
projection narrowed to the primary key) via a new `_count_queryset` seam.
An unsliced COUNT(*) already drops the correlated-subquery annotations the
list orderings add (e.g. `last_processed` on captures), but the LIMIT used
for the precision cap would otherwise re-project them and run the subquery
per scanned row. Counting `values("pk")` keeps the COUNT over a bare
primary-key scan.
This also makes `ProjectPagination.get_count` redundant: the per-view
override existed only to strip those annotations before counting, which the
base paginator now does for every endpoint. Removed it.
Verified on a database snapshot (~2.88M captures in one project): the count
query stays ~5 ms whether or not the annotations are present, and EXPLAIN
confirms the detection subquery is not scanned.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
When the API caps the total count (count_is_exact: false on large result sets), surface it through the high-volume list hooks (captures, occurrences, species, sessions) and render the total as e.g. "10000+" in the pagination info label.

The numbered page buttons still derive from the capped total, so pages beyond the cap are not reachable from the bar until these lists move to cursor pagination; this change keeps the label honest in the meantime.

totalIsExact is an optional prop defaulting to true, so list views not wired to it (small tables that never reach the cap) are unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
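The label rule described in this commit is small enough to sketch. The snippet below is a Python stand-in for the frontend logic, not the actual component code: `format_total_label` is a hypothetical name, its `count_is_exact=True` default mirrors the optional `totalIsExact` prop, and returning an empty string for a skipped count is an assumption.

```python
def format_total_label(count, count_is_exact=True):
    """Render the total for a pagination info label."""
    if count is None:
        return ""  # assumption: show nothing when the count was skipped
    if count_is_exact:
        return str(count)
    return f"{count}+"  # capped total: the real number is at least this large


print(format_total_label(42))            # 42
print(format_total_label(10000, False))  # 10000+
```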
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Summary
Every paginated list endpoint runs a `COUNT(*)` over the filtered result set on each request to populate the `count` field. On large, densely-filtered tables that count can dominate the request: the page of rows is fast and indexed, but counting every matching row is not. This PR makes that cost predictable. The count stays exact up to a threshold (default 10,000); beyond it the endpoint returns the threshold as a lower bound flagged as inexact, so the UI can render "10,000+" instead of the server scanning the whole table. Callers that don't need a total at all can pass `?with_counts=false` to skip the count query entirely.

The default response stays backward-compatible: small result sets return the same exact integer `count` they always did. The only addition is a `count_is_exact` field alongside `count`.

This is one lever in the wider list-performance effort. It bounds the worst case of the count query; it is not a speedup for every endpoint (see "What we still need to verify").
List of Changes
- … `COUNT(*)`; the count's cost is bounded to roughly the threshold regardless of table size.
- `queryset.order_by()[:N+1].count()`, which Postgres plans as `SELECT COUNT(*) FROM (SELECT 1 … LIMIT N) sub`. The list view's `ORDER BY` is stripped first so the `LIMIT` can short-circuit instead of forcing a top-N sort that would scan the whole set anyway.
- `count = 10000` with `count_is_exact: false`; the UI renders this as e.g. "10,000+".
- `count_is_exact` flag; an `_OVER_CAP` sentinel distinguishes "exactly N rows" from "more than N".
- `?with_counts=false`: no count query runs at all.
- `count: null` and `count_is_exact: null`.
- `next`/`previous` links stay correct when there is no reliable total.
- … `next` link, instead of deriving links from `count`.
- `count_is_exact` (`true`/`false`/`null`) documented in the OpenAPI schema; `count` marked nullable.

Behavior and compatibility:
- … `count` with `count_is_exact: true`.
- … `COUNT_PRECISION_THRESHOLD` (default 10,000) to reflect that it bounds count precision, not merely "large querysets".
- `null` is reserved for the explicit `with_counts=false` opt-out; over-the-cap responses return the capped lower bound, not `null`.

All changes are in `ami/base/pagination.py` (`LimitOffsetPaginationWithPermissions`, the project-wide default paginator). Tests in `ami/main/tests.py::TestPaginationWithCounts`.

What we still need to verify
- … `django-cachalot` still caches the capped-count subquery on the default path.
- … `main` for normal-sized result sets.

Frontend follow-up (not in this PR)
- … `count: null` and render the lower bound when `count_is_exact: false` (e.g. "10,000+"). Until that lands, the server default stays `with_counts=true` so existing components keep working unchanged.

Test plan
- `ami/main/tests.py::TestPaginationWithCounts` (8 tests): exact count below the cap, exact at the boundary, capped + inexact over the cap, `with_counts=false` null opt-out, and `next`/`previous` on first/middle/last pages. 8/8 pass.

Update: query cleanup + minimal UI (after review + measurement)
Two follow-ups were added after a structural review and a measurement pass on a dev environment loaded with a production-sized data snapshot (one project with ~2.9M captures).
Query correctness (`ami/base/pagination.py`, `ami/main/api/views.py`): the capped count is now computed over a stripped queryset (ordering removed, projection narrowed to the primary key) via a `_count_queryset` seam. An unsliced `COUNT(*)` already drops the correlated-subquery annotations the list orderings add (e.g. `last_processed` on captures), but the `LIMIT` used for the cap would otherwise re-project them. This also makes the old per-view `ProjectPagination.get_count` override redundant, so it was removed and folded into the base paginator.

Minimal UI handling of the capped count: the four high-volume list views (captures, occurrences, species, sessions) now surface `count_is_exact` and render a capped total as e.g. "10000+" in the pagination info label. `totalIsExact` is an optional prop defaulting to true, so other lists are unchanged. The numbered page buttons still derive from the capped total, so pages beyond the cap are not reachable from the bar until these lists move to cursor pagination (tracked separately). This is a deliberate, documented limitation.

Measurements (single-run, dev bench with a production snapshot, not production APM)
… `EXPLAIN ANALYZE`)

Caveats worth knowing before relying on this:
- … `django-cachalot`); warm counts are already fast. The cap helps the cold / cache-miss path, which dominates while a project is actively ingesting (writes invalidate the count cache). On a quiet table the cap rarely triggers.