Framework for admins to trigger and review post-processing methods by mihow · Pull Request #1289 · RolnickLab/antenna

mihow · 2026-05-01T22:08:51Z

Summary

Establishes the shared pattern for running post-processing filters from the Django admin. Post-processing filters adjust or clean up data after ML processing; the first one (a small-size detection filter) ships here as the worked example. Until now there was no consistent way to find, configure, or run them — you edited code or called them by hand.

With this change an operator can, from the Django admin:

See every registered post-processing filter
Trigger one, and get an intermediate form to configure its options (the filter's arguments) before it runs
Have the run wrapped in a Job, so it appears in the jobs list with progress and logs like any other ML task
Trigger a filter on a single occurrence from the Occurrence admin — the fast spot-check path for developing and tuning a filter without processing a whole capture set
See run metrics on the job (detections_checked, detections_flagged, occurrences_updated), so a run's effect is visible without reading the log

This is a scaffold/precursor PR: it sets the approach every future post-processing method plugs into (class masking #999, tracking #1272, rank rollup, …), rather than shipping one large feature on its own. The size filter is included to prove the pattern end to end. The original scope was pure scaffolding plus a refactor of the existing small-size-filter trigger to expose its already-coded size_threshold knob; during review the per-occurrence trigger and stage metrics were added on top as small, migration-free extensions of the same pattern.

List of Changes

Change (user-effect)	How (implementation)
Operators see every registered post-processing filter as an admin action	Admin-trigger pattern on `BasePostProcessingTask`; tasks registered in `ami/ml/post_processing/registry.py`
Each filter declares its options as a typed schema; bad inputs are rejected before any job runs	Pydantic v1 `config_schema` contract, validated at task `__init__` (`extra="forbid"`)
Configure a filter's options in an intermediate form before it runs	`BasePostProcessingActionForm` with `to_config()` → dict contract + parameterized confirmation template (`admin/post_processing/confirmation.html` + `_form_fieldset.html` partial)
Runs are tracked as Jobs with progress and logs	`to_config()` result feeds `Job.params['config']`
Small-size filter's threshold is now adjustable from the admin (was hardcoded)	Migrated `SmallSizeFilterTask`: `SmallSizeFilterConfig` schema + `SmallSizeFilterActionForm` (`0 < x < 1` validation); same `0.0008` default preserved
A filter can be run on one occurrence, not just a whole capture set	Discriminated scope on the config (`source_image_collection_id` XOR `occurrence_id`, enforced by a root validator) + `OccurrenceAdmin` action using the same factory with `scope_resolver=lambda occ: {"occurrence_id": occ.pk}`; `SmallSizeFilterTask._scoped_detections()` resolves the detection queryset from whichever scope is set
Operators see what a run examined and changed, on the job's detail page	`BasePostProcessingTask.report_stage_metrics()` writes named stage params via `JobProgress.add_or_update_stage_param`; falls back to a log line when running jobless
Covered by tests	Tests across `ami/ml/post_processing/tests/` and `ami/ml/tests.py::TestPostProcessingTasks`, plus the admin changelist tests below, kept cheap for CI (see Test plan)

Admin niceties for reviewing post-processing

Purpose: make the Django admin a practical place to review and test post-processing on a production-sized database (millions of detections and classifications). Together these support the full review loop an operator runs when checking a post-processing method: find an occurrence, run a post-processing filter on it, click through to the detections and classifications it produced to inspect exactly what changed (including the applied_to provenance chain), optionally edit a classification by hand to probe how it moves the result, then recompute the occurrence's determination to see the effect — all without leaving the admin or looking up ids by hand.

Before these changes that loop was impractical: the occurrence/detection/classification changelists timed out (or ran the database out of memory) on real data, the classification change page stalled rendering preloaded dropdowns, there were no links between an occurrence and its detections/classifications, and nothing recomputed an occurrence's determination after a manual classification edit. Each change below removes one of those blockers. They share no code with the framework above and are independent of it.

Change (user-effect)	How (implementation)
`occurrences_updated` counts only occurrences whose determination actually changed	Capture the determination before the save and compare after, instead of counting every re-saved occurrence (`small_size_filter.py`)
After running a filter, the admin success message links to the created Job(s)	`format_html` link to each job's admin change page, so progress and any failure reason are one click away
The Occurrence and Detection changelists load in well under a second on millions of rows (were ~15s, and the detection list could error out of memory)	Count related rows with a correlated subquery instead of `Count(...)` + `GROUP BY`, and order by the indexed `-id` instead of an unindexed `-created_at`
Search occurrences and detections by id (detections also by source-image path)	`get_search_results` treats an all-digit term as an exact id lookup
The Classification change page opens quickly instead of stalling	FK fields (taxon, detection, algorithm, category_map, applied_to) render as autocomplete widgets instead of `<select>`s preloaded with every row
The Classification changelist no longer transfers thousands of floats per row	`scores`/`logits` counted in SQL (`cardinality`) with the arrays deferred
Navigate occurrence → detection → classification by clicking through	`show_change_link` on the detection and classification inlines
Recompute an occurrence's determination after editing its classifications by hand	New `OccurrenceAdmin` action — editing classifications doesn't trigger the recompute that Occurrence/Identification saves do

Adding a new post-processing filter

Three pieces, then one wiring line per admin entry point. Using the upcoming rank rollup and class masking (#999) filters as examples:

Task — subclass BasePostProcessingTask in ami/ml/post_processing/<your_filter>.py; declare key, name, and a pydantic config_schema; implement run(). Call self.update_progress(...) and self.report_stage_metrics({...}) as it works. Add the class to the POSTPROCESSING_TASKS dict in ami/ml/post_processing/registry.py.
Config schema — every knob and scope id the task accepts, e.g. RankRollupConfig(source_image_collection_id, target_rank, min_score) or ClassMaskingConfig(source_image_collection_id | occurrence_id, taxa_list_id). Scope ids follow the discriminated pattern shipped here: declare each as optional and enforce exactly-one with a root validator (copy SmallSizeFilterConfig).
Form — subclass BasePostProcessingActionForm in ami/ml/post_processing/admin/<your_filter>_form.py, declaring only the tunable fields — scope ids come from the selected admin rows, not the form. Class masking's taxa-list picker becomes a ModelChoiceField here. Value bounds belong in the schema, not the form.

Then wire it into each admin it should be triggerable from:

# ami/main/admin.py — e.g. on SourceImageCollectionAdmin
run_rank_rollup = make_post_processing_action(
    RankRollupTask,
    RankRollupActionForm,
    scope_resolver=lambda collection: {"source_image_collection_id": collection.pk},
)
actions = [..., run_rank_rollup]

scope_resolver states what each selected row contributes to the config; the form supplies the tunables. An occurrence-scoped filter (class masking) wires the same factory onto OccurrenceAdmin with scope_resolver=lambda occ: {"occurrence_id": occ.pk}. A task whose row→job mapping isn't 1:1 (tracking partitions events across projects, #1272) passes a custom build_jobs callable instead of scope_resolver. The confirmation page, validation, job creation, and metrics display all come from the shared scaffolding — no per-filter admin code beyond the lines above.

Out of scope (deferred to follow-ups)

Project-partitioning helper (defer to whichever multi-scope adopter lands first — likely feat(tracking): revive occurrence tracking as a post-processing task #1272's tracking, which partitions events across projects)
REST API surface (eventual primary trigger; admin not future-primary)
Rank rollup, class masking, tracking — they stay in their own PRs
Surfacing a run's parameters in the occurrence prediction history. Two complementary directions, both deferred to keep this PR migration-free: a Job FK on Classification/Detection (existing ticket PSv2: add Job relationship to a Classification and Detection objects #1156, where post-processing tasks would set classification.job), and a flexible JSON metadata field on Classification for pipeline-specific extras and per-classification transformation notes (needs its own ticket)

Rebase impact for downstream PRs

feat(tracking): revive occurrence tracking as a post-processing task #1272 (tracking, mine): smaller diff after rebase. admin_forms.py → admin/tracking_form.py. Form gains to_config(). Adds TrackingConfig schema. Per-project partition loop becomes a custom build_jobs.
Let admins mask classifier predictions to a species list #999 (class masking, @mohamedelabbas1996): hand-rolled HTML <select> becomes ClassMaskingActionForm(BasePostProcessingActionForm) with ModelChoiceFields. Adds ClassMaskingConfig + RankRollupConfig schemas. The occurrence-scoped discriminated config shipped here is the template to copy. Will message author once this lands.

Smoke test (manual, against live stack)

End-to-end validated against project 18 capture set 175 (10 images):

Action dropdown lists "Run Small size filter post-processing task (async)" ✓
Click Go → renders intermediate confirmation page (template admin/post_processing/confirmation.html rendered) ✓
Form shows size_threshold field with default 0.0008 and help text ✓

Submit → redirect with success message; Job 1571 created and enqueued; celeryworker picked up; ran to SUCCESS ✓
Job.params payload validated typed: {'task': 'small_size_filter', 'config': {'size_threshold': 0.0008, 'source_image_collection_id': 175}} ✓

The occurrence-scope trigger and stage metrics are covered by the automated tests below, and the branch is deployed on a shared dev box for hands-on review.

Test plan

test_base_schema.py — subclass without schema raises, abstract intermediaries exempt, valid config builds BaseModel instance, defaults apply, scope rules (occurrence accepted / both raise / neither raises), out-of-range threshold raises, unknown keys rejected (extra="forbid")
test_admin_form.py — to_config() round-trip, type coercion at the form layer, range enforcement delegated to the schema
test_action_factory.py — action naming and labels, one Job per row with task key/scope/project FK, schema errors mapped to form fields, mid-creation failure rolls back all jobs, build_jobs override hook
test_small_size_filter_admin.py — intermediate page renders (no Job created), select_across refused, invalid threshold re-renders with error, valid POST creates Job with config, occurrence-scoped POST sets occurrence_id (and not the collection id), multi-collection POST creates per-collection jobs with correct project FKs
ami/ml/tests.py::TestPostProcessingTasks — filter flags small detections and updates determinations, occurrence scope leaves sibling occurrences untouched, stage metrics land on the job's progress params
Manual smoke: trigger from Django admin against existing test data; verify Job runs to completion with custom threshold

A fixture-cost pass keeps CI time in check: the admin/factory tests build bare Project/Collection rows once per class (setUpTestData) instead of calling the full setup_test_project fixture per test, and the pure form tests run without the DB (SimpleTestCase). The module's 29 tests run in ~5 s locally; this also removed a double-fixture setup that intermittently hit a statement timeout in CI.

Design doc

docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md (committed in this branch).

Notes

Pydantic v1 syntax throughout (repo pins pydantic<2.0).
Pyright will surface __fields__ and validator as deprecated in some envs — these come from local v2 stubs; container is v1 and the syntax is correct.

Summary by CodeRabbit

New Features
- Two-step admin post-processing action for the Small Size Filter with confirmation UI, per-selection scoping (collection or occurrence), per-item job creation, and a size-threshold parameter validated server-side.
- Tasks now validate typed configs and can report stage metrics (detections checked/flagged, occurrences updated).
Tests
- Expanded coverage for forms, schema contract, admin confirmation flow, multi-selection job creation, transactional safety, and metrics.
Documentation
- Design doc describing the post-processing admin scaffolding and migration plan.

Context: PRs #999 (class masking) and #1272 (tracking) each grew their own admin-trigger plumbing (registry registration, intermediate confirmation page with form, freeform dict config). This precursor extracts the shared pattern using SmallSizeFilterTask as the migration consumer — proves the abstraction without carving up another contributor's open PR. Spec covers: pydantic config_schema contract on BasePostProcessingTask, BasePostProcessingActionForm class, parameterized confirmation template + partial, SmallSizeFilterTask migration (exposes existing size_threshold knob via admin), test plan, rebase impact for #999 + #1272. Co-Authored-By: Claude <noreply@anthropic.com>

netlify · 2026-05-01T22:08:55Z

✅ Deploy Preview for antenna-preview canceled.

Name	Link
🔨 Latest commit	`cdbb4e4`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-preview/deploys/6a3dcf1f1542fb00088f7d1c

netlify · 2026-05-01T22:08:55Z

✅ Deploy Preview for antenna-ssec canceled.

Name	Link
🔨 Latest commit	`cdbb4e4`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-ssec/deploys/6a3dcf1fa16cdc0008f74cc8

coderabbitai · 2026-05-01T22:08:58Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a Pydantic config contract for post-processing tasks, a shared admin action factory with a two-step confirmation flow, typed SmallSizeFilter config/task, admin forms and templates, wiring in ami/main/admin.py, and comprehensive tests.

Changes

Post-Processing Admin Scaffolding

Layer / File(s)	Summary
BasePostProcessingTask schema contract `ami/ml/post_processing/base.py`, `ami/ml/post_processing/tests/test_base_schema.py`	`BasePostProcessingTask` requires `config_schema` (Pydantic `BaseModel`) and constructs `self.config = self.config_schema(**config)` during init; logs serialize via `self.config.dict()`. Tests validate contract and error paths.
SmallSizeFilterConfig and task schema binding `ami/ml/post_processing/small_size_filter.py`, `ami/ml/tests.py`	Adds `SmallSizeFilterConfig` (size_threshold in (0,1), `extra="forbid"`) and sets `SmallSizeFilterTask.config_schema`; `run()` reads typed config fields, resolves scoped detections via `_scoped_detections()`, and reports `detections_checked`, `detections_flagged`, and `occurrences_updated`.
Admin form base and SmallSizeFilter form `ami/ml/post_processing/admin/forms.py`, `ami/ml/post_processing/admin/small_size_filter_form.py`, `ami/ml/post_processing/tests/test_admin_form.py`	`BasePostProcessingActionForm.to_config()` maps `cleaned_data` → config dict. `SmallSizeFilterActionForm` exposes `size_threshold: FloatField` with initial value taken from `SmallSizeFilterConfig`. Tests cover cleaned output, initial defaults, non-numeric rejection, and passthrough behavior.
Confirmation page and form fieldset templates `ami/templates/admin/post_processing/_form_fieldset.html`, `ami/templates/admin/post_processing/confirmation.html`	New admin templates: a reusable `_form_fieldset.html` that renders fields, labels, help text, and errors, and `confirmation.html` for the two-step confirmation form with CSRF, hidden selection fields, and submit/cancel controls.
Shared admin action factory and admin wiring `ami/ml/post_processing/admin/actions.py`, `ami/main/admin.py`, `ami/ml/post_processing/tests/test_action_factory.py`	Introduces `make_post_processing_action` (with `default_build_jobs`, pydantic→form error mapping, atomic job creation, and confirmation rendering). `SourceImageCollectionAdmin.run_small_size_filter` and `OccurrenceAdmin.run_small_size_filter` are replaced/added as factory-created actions bound to `SmallSizeFilterTask` and `SmallSizeFilterActionForm`. Tests cover registration, default behavior, error mapping/atomicity, and the `build_jobs` override hook.
Form and admin integration tests `ami/ml/post_processing/tests/test_admin_form.py`, `ami/ml/post_processing/tests/test_small_size_filter_admin.py`, `ami/ml/tests.py`	Adds tests for admin form typing/cleaning and comprehensive admin action flows: intermediate confirmation rendering, validation error re-rendering with no job creation, confirmed job creation with correct `params` and per-job `project_id`, multi-collection job creation, occurrence-scoped runs, and task run metrics reporting.
Design specification `docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md`	Design doc describing the Pydantic schema contract, admin form contract, confirmation template/flow, refactor into `make_post_processing_action(...)`, and associated tests and rebase notes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

backend, ml, data-processing

Suggested reviewers

loppear
annavik

Poem

A rabbit hops through admin halls so grand,
With Pydantic schemas held in paw,
Forms and confirmations tidy the land,
Jobs queued neatly without a flaw.
Hooray—small filters march in order now! 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 19.70% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: establishing a framework for admins to trigger and review post-processing methods, which is the core feature added across all the modified and new files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The PR description is well structured and covers summary, changes, testing, screenshots, and deployment notes, with only minor template sections left incomplete.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/post-processing-admin-scaffolding

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

Adds the shared infrastructure that PRs #999 (class masking) and #1272 (tracking) each grew independently. Uses SmallSizeFilterTask as the migration consumer to prove the pattern without carving up another open PR. Changes: - BasePostProcessingTask gains required pydantic config_schema class attr; validates Job.params['config'] at task __init__ via self.config_schema(**config). self.config is now a BaseModel instance (was: freeform dict). - SmallSizeFilterTask: adds SmallSizeFilterConfig schema (size_threshold + source_image_collection_id, validators for 0<x<1, extra=forbid). Existing default 0.0008 preserved. - New admin form base BasePostProcessingActionForm with to_config() contract, plus SmallSizeFilterActionForm exposing size_threshold knob (currently unreachable — admin trigger hardcoded empty config). - Parameterized confirmation template at admin/post_processing/confirmation.html with overridable {% block intro %} and a _form_fieldset.html partial. - run_small_size_filter admin action rewrites onto new pattern: intermediate confirmation page on first POST, validate + enqueue on confirm. Per-collection Job creation preserved (each Job gets correct project FK). - 17 tests covering schema contract, form validation, intermediate page render, multi-collection partitioning by project FK. Pydantic v1 syntax throughout (repo pins pydantic<2.0). Memory note: container is v1, .dict() / .__fields__ are correct here. Out of scope (explicitly deferred): - Project-partitioning helper (defer to whichever multi-scope adopter lands first, likely #1272's tracking which partitions events across projects) - REST API trigger surface (eventual primary; admin not future-primary) - Rank rollup, class masking, tracking tasks themselves — stay in their PRs Design doc: docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md Co-Authored-By: Claude <noreply@anthropic.com>

Confirmation page (intermediate page rendering with size_threshold form field) and success message (after submitting form, Job 1571 enqueued with typed config payload). Co-Authored-By: Claude <noreply@anthropic.com>

Screenshots for the PR description now live in the ami-media-staging object store rather than docs/claude/screenshots/. Keeps binary assets out of the repo history. Co-Authored-By: Claude <noreply@anthropic.com>

Copilot

Pull request overview

Establishes a reusable Django-admin pattern for triggering ML post-processing tasks with an intermediate configuration form, validates config via Pydantic schemas, and migrates the existing Small Size Filter to the new framework (including admin-adjustable size_threshold and new tests).

Changes:

Add a config_schema (Pydantic v1) contract to BasePostProcessingTask and validate task config at construction time.
Introduce a base admin action form + shared confirmation templates for rendering/confirming post-processing parameters.
Migrate the Small Size Filter admin action to the new confirmation + schema-validated config flow and add test coverage for the new behavior.

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md	Design doc describing the new admin/scaffolding pattern and migration plan.
ami/ml/post_processing/base.py	Enforces `config_schema` and validates config via Pydantic on task initialization.
ami/ml/post_processing/small_size_filter.py	Adds `SmallSizeFilterConfig` schema and switches task to typed config access.
ami/ml/post_processing/admin/forms.py	Adds `BasePostProcessingActionForm` with a `to_config()` contract.
ami/ml/post_processing/admin/small_size_filter_form.py	Adds `SmallSizeFilterActionForm` exposing `size_threshold` in admin.
ami/templates/admin/post_processing/confirmation.html	Adds shared intermediate confirmation template for admin actions.
ami/templates/admin/post_processing/_form_fieldset.html	Adds shared partial to render form fields + errors.
ami/main/admin.py	Rewrites `run_small_size_filter` to render intermediate form and enqueue schema-validated jobs.
ami/ml/post_processing/tests/test_base_schema.py	Tests `config_schema` enforcement and schema validation behavior.
ami/ml/post_processing/tests/test_admin_form.py	Tests base form `to_config()` and size threshold validation.
ami/ml/post_processing/tests/test_small_size_filter_admin.py	Tests admin confirmation flow + job creation/config payload behavior.
ami/ml/post_processing/admin/init.py	Creates admin module package.
ami/ml/post_processing/tests/init.py	Creates tests package.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Address review feedback on the repetitive run_/render pattern. The confirm -> render -> validate -> enqueue flow is now built by make_post_processing_action() in ami/ml/post_processing/admin/actions.py, so each task declares only what varies: its task class, knob form, and how a selected row maps to a Job (scope_resolver / project_resolver / name_resolver). Tasks whose row->Job mapping isn't one-Job-per-row (e.g. PR #1272's per-project event partitioning) pass a custom build_jobs callable; PR #999 likewise inherits the form/template instead of hand-rolling HTML. Config validation now has a single source of truth: the task's pydantic config_schema. The knob form no longer re-encodes the (0, 1) bound, and the FloatField min/max that contradicted the exclusive interval is gone. Schema errors raised while building Jobs are mapped back onto the form so the operator sees them inline on the confirmation page (non-field errors now render too). This also removes the near-dead per-row try/except in the old admin action. Adds test_action_factory.py (action registration, default builder, schema mapping, all-or-nothing, build_jobs override hook) and updates the form tests to reflect that range enforcement moved to the schema layer. Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

ami/ml/post_processing/base.py (1)

30-35: ⚡ Quick win

Enforce config_schema type, not just presence.

__init_subclass__ currently allows non-Pydantic values for config_schema; that fails later in __init__ instead of at class definition time.

Proposed tightening

     def __init_subclass__(cls, **kwargs):
         super().__init_subclass__(**kwargs)
         required_attrs = ["key", "name", "config_schema"]
         for attr in required_attrs:
             if not hasattr(cls, attr) or getattr(cls, attr) is None:
                 raise TypeError(f"{cls.__name__} must define '{attr}' class attribute")
+        if not isinstance(cls.config_schema, type) or not issubclass(
+            cls.config_schema, pydantic.BaseModel
+        ):
+            raise TypeError(
+                f"{cls.__name__}.config_schema must be a subclass of pydantic.BaseModel"
+            )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/ml/post_processing/base.py` around lines 30 - 35, The current
__init_subclass__ only checks presence of "config_schema" but not its type;
update __init_subclass__ to also validate that cls.config_schema is a class/type
and a subclass of pydantic.BaseModel (i.e. import pydantic.BaseModel and use
isinstance/issubclass checks), and raise a TypeError if it's missing or not a
Pydantic model; keep the existing checks for "key" and "name" and reference the
same attribute names (config_schema, key, name) and the __init_subclass__ method
when making the change.

ami/templates/admin/post_processing/_form_fieldset.html (1)

1-8: ⚡ Quick win

Render form.non_field_errors before field rows (future-proofing the shared partial).

ami/templates/admin/post_processing/_form_fieldset.html renders only per-field field.errors; it never outputs form.non_field_errors. For the current SmallSizeFilterActionForm, validation happens in clean_size_threshold() (field-level), so operators already see errors via field.errors—but this partial will hide non-field errors if a form later adds clean() / add_error(None, ...)-style validation.
Suggested patch
+{% if form.non_field_errors %}
+  {% for error in form.non_field_errors %}<p class="errornote">{{ error }}</p>{% endfor %}
+{% endif %}
 {% for field in form %}
   <div class="form-row">
     {{ field.label_tag }}
     {{ field }}
     {% if field.help_text %}<p class="help">{{ field.help_text }}</p>{% endif %}
     {% for error in field.errors %}<p class="errornote">{{ error }}</p>{% endfor %}
   </div>
 {% endfor %}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/templates/admin/post_processing/_form_fieldset.html` around lines 1 - 8,
The partial currently only renders per-field errors; modify
ami/templates/admin/post_processing/_form_fieldset.html to output
form.non_field_errors (e.g., {{ form.non_field_errors }}) before the for loop so
any form-level validation added via clean() or add_error(None, ...) (as might
happen in SmallSizeFilterActionForm) will be visible to operators; ensure
non-field errors render before the field rows for future-proofing.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/ml/post_processing/admin/small_size_filter_form.py`:
- Around line 18-20: Replace the Unicode multiplication sign (`×`) in the help
string "Minimum bounding-box area as a fraction of the source image area (width
× height). Detections smaller than this are flagged as 'Not identifiable'.
Default 0.0008 ≈ 0.08% of frame area." with a plain ASCII "x" so the text reads
"(width x height)"; update that literal help text in small_size_filter_form.py
(the same string shown in the diff) to satisfy the Ruff RUF001 lint rule.

In `@ami/ml/post_processing/tests/test_small_size_filter_admin.py`:
- Around line 18-21: Replace the hardcoded password string used in
User.objects.create_superuser by avoiding a literal: call
User.objects.create_superuser(email="ssfadmin@example.com", password=None) (or
any non-literal), then immediately call self.superuser.set_unusable_password()
and self.superuser.save() to ensure no secret literal remains; update references
to self.superuser accordingly.

In `@docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md`:
- Around line 50-70: The fenced code block that lists the project tree starting
with "ami/ml/post_processing/" is missing a language tag; update the opening
triple-backtick to include a language identifier (e.g., ```text) so the block
becomes ```text and the closing backticks remain, ensuring markdownlint MD040 is
satisfied for the tree block in the document.

---

Nitpick comments:
In `@ami/ml/post_processing/base.py`:
- Around line 30-35: The current __init_subclass__ only checks presence of
"config_schema" but not its type; update __init_subclass__ to also validate that
cls.config_schema is a class/type and a subclass of pydantic.BaseModel (i.e.
import pydantic.BaseModel and use isinstance/issubclass checks), and raise a
TypeError if it's missing or not a Pydantic model; keep the existing checks for
"key" and "name" and reference the same attribute names (config_schema, key,
name) and the __init_subclass__ method when making the change.

In `@ami/templates/admin/post_processing/_form_fieldset.html`:
- Around line 1-8: The partial currently only renders per-field errors; modify
ami/templates/admin/post_processing/_form_fieldset.html to output
form.non_field_errors (e.g., {{ form.non_field_errors }}) before the for loop so
any form-level validation added via clean() or add_error(None, ...) (as might
happen in SmallSizeFilterActionForm) will be visible to operators; ensure
non-field errors render before the field rows for future-proofing.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 98c48d31-30bc-478f-af01-8862514274eb

📥 Commits

Reviewing files that changed from the base of the PR and between e3f662d and 93bd529.

📒 Files selected for processing (13)

ami/main/admin.py
ami/ml/post_processing/admin/__init__.py
ami/ml/post_processing/admin/forms.py
ami/ml/post_processing/admin/small_size_filter_form.py
ami/ml/post_processing/base.py
ami/ml/post_processing/small_size_filter.py
ami/ml/post_processing/tests/__init__.py
ami/ml/post_processing/tests/test_admin_form.py
ami/ml/post_processing/tests/test_base_schema.py
ami/ml/post_processing/tests/test_small_size_filter_admin.py
ami/templates/admin/post_processing/_form_fieldset.html
ami/templates/admin/post_processing/confirmation.html
docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md

Addresses review feedback. When an operator uses Django admin's "select all across pages", the action receives the entire filtered table as its queryset, which would serialize every pk into hidden _selected_action inputs on the confirmation page (a very large POST body, potentially over request limits). This trigger is for explicit, bounded selections, so the across-pages case is now refused with a clear message instead of rendering an unbounded form. Also folds in two cosmetic review nits: replace the Unicode multiplication sign in the size-threshold help text with a plain "x", and add a language tag to the module-layout code fence in the design doc. Co-Authored-By: Claude <noreply@anthropic.com>

mihow · 2026-06-05T01:06:58Z

@coderabbitai review

Note: the previous automated review covered commit 93bd5299, which predates the controller-abstraction refactor. The current head (9b4b7b37) moves the admin trigger into a factory at ami/ml/post_processing/admin/actions.py, single-sources config validation on the pydantic schema, and adds a select_across guard. Re-reviewing against head so the findings reflect the code that will actually merge.

coderabbitai · 2026-06-05T01:09:26Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== ami/ml/post_processing/admin/actions.py ===" 
cat ami/ml/post_processing/admin/actions.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/admin/forms.py ==="
cat ami/ml/post_processing/admin/forms.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/admin/small_size_filter_form.py ==="
cat ami/ml/post_processing/admin/small_size_filter_form.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/base.py ==="
cat ami/ml/post_processing/base.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/small_size_filter.py ==="
cat ami/ml/post_processing/small_size_filter.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/main/admin.py (post_processing section) ==="
grep -n "small_size_filter\|post_processing\|SmallSize\|select_across\|BasePostProcessing\|run_small_size" ami/main/admin.py | head -80

echo ""
echo "=== ami/main/admin.py full relevant section ==="
grep -n "" ami/main/admin.py | grep -A5 -B5 "small_size_filter\|SmallSizeFilter\|post_processing\|select_across" | head -120

echo ""
echo "=== ami/templates/admin/post_processing/confirmation.html ==="
cat ami/templates/admin/post_processing/confirmation.html 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/templates/admin/post_processing/_form_fieldset.html ==="
cat ami/templates/admin/post_processing/_form_fieldset.html 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/tests/ tree ==="
find ami/ml/post_processing/tests -type f | sort

echo ""
echo "=== ami/ml/post_processing/tests/test_admin_form.py ==="
cat ami/ml/post_processing/tests/test_admin_form.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/tests/test_base_schema.py ==="
cat ami/ml/post_processing/tests/test_base_schema.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/tests/test_small_size_filter_admin.py ==="
cat ami/ml/post_processing/tests/test_small_size_filter_admin.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/__init__.py ==="
cat ami/ml/post_processing/__init__.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/admin/__init__.py ==="
cat ami/ml/post_processing/admin/__init__.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/ml/post_processing/tests/__init__.py ==="
cat ami/ml/post_processing/tests/__init__.py 2>/dev/null || echo "NOT FOUND"

Length of output: 38389

🏁 Script executed:

#!/bin/bash
echo "=== ami/ml/post_processing/tests/test_action_factory.py ==="
cat ami/ml/post_processing/tests/test_action_factory.py 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ami/main/admin.py lines around run_small_size_filter (full context) ==="
sed -n '640,680p' ami/main/admin.py

echo ""
echo "=== Check for select_across in confirmation template ==="
grep -rn "select_across" ami/templates/ ami/ml/ ami/main/admin.py

echo ""
echo "=== Check if select_across POST value is preserved across re-renders ==="
grep -n "select_across" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check Job.enqueue() signature ==="
ast-grep --pattern 'def enqueue($$$)'

echo ""
echo "=== Check if config_schema attribute check handles abstract intermediary correctly ==="
grep -n "__init_subclass__\|abc.ABC\|ABCMeta\|abstractmethod" ami/ml/post_processing/base.py

echo ""
echo "=== Check how SmallSizeFilterConfig.dict() serializes (pydantic v1 vs v2) ==="
grep -rn "pydantic" ami/ml/post_processing/ --include="*.py" | grep -v "test_\|\.pyc"

echo ""
echo "=== Check for any pydantic version pinning ==="
grep -rn "pydantic" requirements*.txt setup.cfg pyproject.toml 2>/dev/null | head -20

echo ""
echo "=== Check if confirm value is '1' or truthy in the action ==="
grep -n "confirm" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check the select_across guard return value ==="
grep -n -A3 "select_across" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check queryset.count() call in render_confirmation - N+1 risk ==="
grep -n "count\|values_list" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check SmallSizeFilterTask name attribute (has space vs underscore for key) ==="
grep -n "^    key\|^    name\|^    config_schema" ami/ml/post_processing/small_size_filter.py

echo ""
echo "=== Check if test_base_schema tests hit the DB (Algorithm.objects.get_or_create) ==="
grep -n "get_or_create\|Algorithm" ami/ml/post_processing/base.py

echo ""
echo "=== Check if there is any transaction.atomic around job creation ==="
grep -n "atomic\|transaction" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check what happens when build_jobs raises ConfigValidationErrors but form has errors too ==="
grep -n "add_error\|non_field_errors\|form.errors" ami/ml/post_processing/admin/actions.py

echo ""
echo "=== Check if 'name = Small size filter' has lowercase 's' which differs from action label 'Small Size Filter' ==="
grep -n "name\s*=\s*['\"]" ami/ml/post_processing/small_size_filter.py ami/ml/post_processing/admin/actions.py

Length of output: 15296

@mihow: The PR images show a clean confirmation page (Django admin "Run Small Size Filter" form with size_threshold: 0.0008) and a green success banner ("Queued Small Size Filter for 1 capture set(s). Jobs: [1571]"). The end-to-end UX is solid.

Here is the review against head (9b4b7b37).

Review of PR `#1289` — Post-Processing Admin Scaffolding

🔴 Important — `default_build_jobs` is not atomic

default_build_jobs in ami/ml/post_processing/admin/actions.py validates all rows first (good, all-or-nothing), but the creation loop is not wrapped in transaction.atomic():

for obj, model in validated:
    job = Job.objects.create(...)   # auto-commits immediately
    job.enqueue()                   # transaction.on_commit fires right away
    job_pks.append(job.pk)

Because Job.enqueue() uses transaction.on_commit, jobs committed before a mid-loop exception are already in the DB and will be dispatched to the queue. An operator selecting 10 collections gets jobs 1–7 running while jobs 8–10 were never created.

Suggested fix:

from django.db import transaction

# inside default_build_jobs, replace the creation loop with:
with transaction.atomic():
    job_pks: list[int] = []
    for obj, model in validated:
        job = Job.objects.create(
            name=name_resolver(task_cls, obj),
            project=project_resolver(obj),
            job_type_key="post_processing",
            params={"task": task_cls.key, "config": model.dict()},
        )
        job.enqueue()   # on_commit fires only after the atomic block commits
        job_pks.append(job.pk)
return job_pks

🟡 Medium — Pydantic v1 API with no version pin

The code uses pydantic v1-only surface:

@pydantic.validator (small_size_filter.py Line 13)
model.dict() (actions.py Line 119)
SmallSizeFilterConfig.__fields__["size_threshold"].default (small_size_filter_form.py Line 9)

No pydantic version constraint was found in requirements*.txt, setup.cfg, or pyproject.toml. If pydantic v2 is installed, these work under the compatibility shim but emit deprecation warnings. Consider either pinning pydantic>=1,<2 or migrating to v2 (@field_validator, .model_dump(), .model_fields).

🟡 Medium — `__init_subclass__` blocks abstract intermediary classes

BasePostProcessingTask.__init_subclass__ fires on every subclass, including abstract ones. Any future attempt to add a shared abstract mixin like:

class AbstractFilterTask(BasePostProcessingTask):
    """Common logic for size/shape filters."""
    # config_schema intentionally deferred to concrete subclass
    ...

will immediately raise TypeError. A simple guard:

def __init_subclass__(cls, **kwargs):
    super().__init_subclass__(**kwargs)
    import inspect
    if inspect.isabstract(cls):
        return
    for attr in ["key", "name", "config_schema"]:
        if not hasattr(cls, attr) or getattr(cls, attr) is None:
            raise TypeError(f"{cls.__name__} must define '{attr}' class attribute")

🟡 Medium — Two DB hits in `render_confirmation`

Lines 151–152 of actions.py issue two separate queries:

"selected_count": queryset.count(),
"selected_pks": [str(pk) for pk in queryset.values_list("pk", flat=True)],

One pass is enough:

selected_pks = [str(pk) for pk in queryset.values_list("pk", flat=True)]
...
"selected_count": len(selected_pks),
"selected_pks": selected_pks,

🟡 Medium — Custom `build_jobs` receives `scope_resolver=None`

When make_post_processing_action is called with only build_jobs (no scope_resolver), the factory still passes scope_resolver=None in the kwargs dict forwarded to the custom runner:

kwargs: dict[str, Any] = dict(
    ...
    scope_resolver=scope_resolver,   # None when build_jobs is provided
    ...
)
runner(**kwargs)

A custom build_jobs that blindly unpacks all kwargs and calls scope_resolver(obj) would get an obscure TypeError. The cleanest fix is to omit scope_resolver from kwargs when it is None, or document that custom implementations must use **kwargs and not call scope_resolver directly.

🟡 Medium — `TestBuildJobsOverrideHook` test doesn't actually invoke the action

test_custom_build_jobs_is_used_instead_of_default only checks action.__name__ and constructs the action — it never calls the returned callable or asserts calls was populated. The custom_build_jobs is never actually exercised:

action = make_post_processing_action(...)
self.assertEqual(action.__name__, "run_small_size_filter")
# ← no call to action(...) or assertion on `calls`

Consider adding a call through the admin client (like the other test classes do) or at minimum invoking action(mock_admin, mock_request, qs) and asserting calls was populated.

🟢 Low — `name` capitalisation inconsistency

SmallSizeFilterTask.name = "Small size filter" (lowercase 's') while the custom title="Run Small Size Filter" and description="Run Small Size Filter ..." in admin.py use title case. The job name produced by name_resolver will say "Small size filter" while the UI says "Small Size Filter". Not user-facing for now, but could confuse operators reading job logs. Aligning to name = "Small Size Filter" would keep everything consistent.

✅ What's working well

The factory pattern in make_post_processing_action cleanly separates task definition from admin wiring — new tasks are genuinely one declaration away.
The all-or-nothing schema validation before any Job.objects.create() is the right default.
select_across guard with a clear user-facing message is a good defensive touch.
_schema_errors_to_form_fields mapping pydantic error locs back to Django form fields is elegant and keeps the UX tight.
Test coverage across test_action_factory.py, test_small_size_filter_admin.py, test_base_schema.py, and test_admin_form.py is comprehensive.
Job.enqueue() already uses transaction.on_commit, so wrapping the creation loop in atomic() will make enqueues correctly deferred with zero extra effort.

Follow-up to the head-commit CodeRabbit re-review: - Wrap the Job-creation loop in default_build_jobs in transaction.atomic(). Admin requests are already atomic (ATOMIC_REQUESTS=True), but this helper can also be called outside a request (e.g. a management command), so the explicit block keeps creation all-or-nothing there too. Job.enqueue() uses transaction.on_commit, so the (millisecond) block only guards row creation; the long-running task still runs asynchronously in the worker after commit. - Guard BasePostProcessingTask.__init_subclass__ with inspect.isabstract so an abstract intermediary task class isn't forced to declare key/name/config_schema. - Resolve the selection once in render_confirmation (len of the materialized pk list) instead of a separate .count() query. - Omit scope_resolver from the build_jobs kwargs when it isn't set, so a custom build_jobs supplied without one never receives a None it might call. - Derive the admin action's labels (title, submit, dropdown description) from task.name instead of hardcoding title-case strings, so the operator-facing label and the Job name stay consistent and there are fewer per-task strings. - Make the build_jobs override-hook test actually invoke the action and assert the custom builder ran (and that scope_resolver was not forwarded as None). 24/24 post_processing tests green. Co-Authored-By: Claude <noreply@anthropic.com>

mihow · 2026-06-05T22:47:00Z

Claude says: Thanks — the re-review against head caught real issues. Addressed in abcb446e:

🔴 default_build_jobs not atomic — wrapped the creation loop in transaction.atomic(). Worth noting the dispatch-partial scenario doesn't actually arise in the admin path: ATOMIC_REQUESTS = True (config/settings/base.py:59) already wraps the whole admin request, so a mid-loop failure rolls back and the on_commit enqueues never fire. The explicit block is for callers outside a request (e.g. a future management command via feat(tracking): revive occurrence tracking as a post-processing task #1272's custom build_jobs), where there's no ambient transaction. Since enqueue() only registers an on_commit, the block guards row creation (ms) — the long task still runs async in the worker after commit.
🟡 __init_subclass__ blocks abstract intermediaries — guarded with inspect.isabstract(cls); concrete tasks still must declare key/name/config_schema.
🟡 Two DB hits in render_confirmation — selection resolved once; count is len() of the materialized pk list.
🟡 Custom build_jobs gets scope_resolver=None — scope_resolver is now omitted from the forwarded kwargs when unset.
🟢 name casing — labels (title, submit, dropdown) now derive from task.name rather than hardcoded title-case strings, so the operator-facing label and the Job name stay consistent (both "Small size filter"). Avoided renaming task.name itself since it's the lookup key in Algorithm.objects.get_or_create(name=...).
Test gap — TestBuildJobsOverrideHook now invokes the action through the confirm leg and asserts the custom builder ran (and that scope_resolver isn't forwarded as None).

One correction: the "pydantic v1, no version pin" finding is a false negative — it's pinned at requirements/base.txt:17 (pydantic<2.0 # ... django pydantic field), and the CI container runs 1.10.26. The search looked at top-level requirements*.txt and missed the requirements/ directory. The v1 syntax is correct and enforced.

24/24 post_processing tests green.

…tract guard End-of-PR test audit (separate from the TDD scaffolding): - Remove test_default_threshold_applies_when_form_uses_initial: size_threshold is a required field and can't be omitted, so it just re-POSTed the default value — identical mechanism to the existing valid-post test. The default is already covered by the form-initial and schema-default unit tests. - Rewrite the mislabeled "all-or-nothing" test into a real one. The old version gave every row the same out-of-range threshold, so both failed at validation and the creation loop was never reached. The new test passes validation, forces a failure while creating the second Job, and asserts the first is rolled back — actually exercising the transaction.atomic() wrap. - Add coverage for the inspect.isabstract guard in __init_subclass__: an abstract intermediary may defer key/name/config_schema to its concrete subclasses. 24 post_processing tests green. Co-Authored-By: Claude <noreply@anthropic.com>

Make post-processing filters easier to develop against and observe. Per-occurrence trigger (the dev/spot path): SmallSizeFilterConfig now accepts either source_image_collection_id or occurrence_id (discriminated scope, exactly-one enforced by a pydantic root_validator), and the task resolves its detection queryset from whichever scope is set. OccurrenceAdmin gains a run_small_size_filter action via the same make_post_processing_action factory, so an operator can run a filter on a single selected occurrence without standing up a whole capture set. This discriminated-scope shape is the pattern future per-occurrence / per-event tasks (#999, #1272) copy. Job stage metrics (run observability): BasePostProcessingTask.report_stage_metrics writes named counters onto the job's post_processing stage (falling back to a log line when run without a job). The small size filter emits detections_checked, detections_flagged and occurrences_updated at each flush and on completion, so the Jobs admin page shows what a run examined and changed. Tests: schema scope cases (occurrence-only valid, both-scopes raise, no-scope raises); the OccurrenceAdmin action enqueues a job scoped to occurrence_id; run-time occurrence-scope isolation (sibling occurrences untouched); and stage-metric reporting on a real job. Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/ml/post_processing/small_size_filter.py`:
- Around line 149-153: The current batch flush loop uses the batch-local
occcurrences_to_update set and then clears it, which allows the same occurrence
to be counted multiple times across batches; introduce a persistent set (e.g.,
updated_occurrence_ids) at the scope of the processing function/class to record
IDs of occurrences already counted, then when iterating occcurrences_to_update
only call occ.save(...) and increment modified_occurrences / occurrences_updated
for occurrences whose id is not in updated_occurrence_ids, add their ids to
updated_occurrence_ids, and finally clear the batch set; apply the same change
to the other flush block that uses occcurrences_to_update so counts are
deduplicated across all batches.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b26c680-56a7-47e7-b05e-ff240b54866c

📥 Commits

Reviewing files that changed from the base of the PR and between abcb446 and beacb3f.

📒 Files selected for processing (7)

ami/main/admin.py
ami/ml/post_processing/base.py
ami/ml/post_processing/small_size_filter.py
ami/ml/post_processing/tests/test_action_factory.py
ami/ml/post_processing/tests/test_base_schema.py
ami/ml/post_processing/tests/test_small_size_filter_admin.py
ami/ml/tests.py

🚧 Files skipped from review as they are similar to previous changes (2)

ami/ml/post_processing/base.py
ami/ml/post_processing/tests/test_action_factory.py

The metric incremented from a batch-local set that is cleared each flush, so an occurrence whose detections span more than one batch was counted once per batch. Track occurrence ids in a persistent set and report its size instead. Detections never recur across batches, so detections_flagged was already correct. Addresses CodeRabbit review on beacb3f. Co-Authored-By: Claude <noreply@anthropic.com>

…admin-scaffolding

Replace per-test setup_test_project calls with minimal Project/Collection/ Occurrence rows and class-level setUpTestData in the admin-action and factory tests; the admin flow only reads FKs, so the full fixture (storage source, deployment, processing service per call) was wasted cost. Convert the pure form tests to SimpleTestCase (no DB) and drop one redundant valid-path form test already covered end to end by the admin and factory flows. TestPostProcessingTasks now builds its images/events/collection once per class instead of once per test. Module wall-clock roughly halves locally (8.8s -> 4.5s warm), and the double setup_test_project in the multi-collection admin test that intermittently hit a statement timeout in CI is gone. Co-Authored-By: Claude <noreply@anthropic.com>

…admin-scaffolding

… select_across refusal The post-processing confirmation form rendered each field's validation errors with the page-level `errornote` banner style. Switch the form-fieldset partial to Django's native `{{ field.errors }}` and `{{ form.non_field_errors }}` so errors render inline as `errorlist`, matching Django admin conventions. The invalid-threshold test now asserts the inline `errorlist` class. Also reframe the "select all across pages" comment: the reason to refuse it is that the action would apply to the entire filtered table rather than the rows the operator explicitly selected. The oversized POST body is a secondary symptom, not the primary concern. Co-Authored-By: Claude <noreply@anthropic.com>

…admin-scaffolding

The small size filter reported `occurrences_updated` as the number of occurrences it re-saved, but re-saving recomputes the determination in place. An occurrence pinned to a human identification keeps its taxon when its detection is flagged "Not identifiable", so it was counted even though nothing about it actually changed. Capture each occurrence's determination before the save and count it only when the determination changes after. Add a regression test pairing an occurrence that carries a human identification (determination unchanged, not counted) with an un-identified one (determination flips, counted). Co-Authored-By: Claude <noreply@anthropic.com>

…b(s) After an operator runs a post-processing task from the admin, the success message now links each created Job to its admin change page instead of just printing the raw pks. From there the operator can follow the job's progress and read any failure reason, which is otherwise only visible in the worker log. The link points at the admin change page because that page is always reachable from this admin action; the public UI host is not reliably known in this context. The message is built with format_html so the links render as anchors. Co-Authored-By: Claude <noreply@anthropic.com>

mihow · 2026-06-25T19:18:23Z

Claude says: Two small fixes landed on top of the main-merge (5e2438bb, b3577df7); branch is up to date with main.

occurrences_updated now counts only occurrences whose determination actually changed. The small size filter re-saves each affected occurrence, which recomputes the determination in place — an occurrence pinned to a human identification keeps its taxon, so it was previously counted even when nothing changed. Regression test added.
The admin action's success message now links each created Job to its admin change page, so an operator can follow progress and read any failure reason there rather than only in the worker log.

The framework was also exercised end-to-end on production-shaped data on a dev box: the admin → Job → Celery path drives small_size_filter and class_masking (#999) on real captures — including the human-ID-precedence case, the excluded-class-can't-win case, and the all-excluded guard (clean job failure, no side effects). 31 post-processing tests pass locally; makemigrations --check, lint, and CodeRabbit are green.

Ready for a final look — blocked only on an approve.

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

The occurrence admin changelist is the entry point for the per-occurrence post-processing trigger, but on a large table (~1.3M occurrences) it took ~15s to load a single page. Two causes, both fixed here: - get_queryset counted detections with a JOIN + GROUP BY. A grouped count must aggregate the whole occurrence x detection join before the changelist's ORDER BY ... LIMIT can take a page, so it scanned every row just to show 25. Replaced with a correlated subquery (Coalesced to 0 for occurrences with no detections, matching the old JOIN count), which runs only for the rows on the page. Measured 14.9s -> 0.04s for one page on the large table. - The list ordered by -created_at, which has no index, forcing a full sort of the table to find the newest page. Switched to -id, the indexed primary key, which increases with insertion time so newest-first is preserved. Also let an all-digit search term act as an exact occurrence-id lookup, so an operator can jump straight to an occurrence by id; non-numeric terms still use the determination-name search. Co-Authored-By: Claude <noreply@anthropic.com>

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

…om occurrences The detection admin changelist counted classifications with Count() + GROUP BY and ordered by an unindexed -created_at. On a large table (~1.4M detections, ~2.2M classifications) the grouped aggregate runs over the whole detection x classification join before the page LIMIT applies — slow enough to exhaust work_mem and error out. Replaced with a correlated subquery (Coalesced to 0) and -id ordering, mirroring the occurrence changelist fix. Also set show_change_link on the detection inline of the occurrence page, so an operator can open a detection's change page — where the classifications inline shows which algorithms were applied, including post-processing — without searching for its id. Co-Authored-By: Claude <noreply@anthropic.com>

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

Add a search box to the detection changelist: an all-digit term is an exact id lookup (jump straight to a detection by id), other terms search the source image path. Mirrors the occurrence changelist search. Co-Authored-By: Claude <noreply@anthropic.com>

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

…ation More admin niceties for reviewing post-processing results on a large database: - Classification change page: FK fields (taxon, detection, algorithm, category_map, applied_to) render as autocomplete widgets instead of <select>s preloaded with every row, which made the page unusable. - Classification changelist: count the scores/logits arrays in SQL (cardinality) and defer the arrays, so the list no longer transfers thousands of floats per row; order by -id; add an id / taxon-name search box. - Link the detection page's classification inline rows to their change page. - New OccurrenceAdmin action "Recompute determination": editing classifications by hand doesn't refresh the occurrence determination (only Occurrence and Identification saves do), so this re-derives it for the selected occurrences. Co-Authored-By: Claude <noreply@anthropic.com>

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

… count, guard huge ids Three review follow-ups on the admin niceties added for reviewing post-processing: - Extract the duplicated digit-term id search from the Occurrence, Detection, and Classification admins into a shared IdSearchAdminMixin. The three copies were identical; the mixin is now the single place that behavior lives. - Guard an out-of-range numeric term. A digit string larger than the bigint primary key raised a database DataError (500) before; it now returns no results. - Set show_full_result_count = False on the three large-table admins, so the changelist no longer runs a second, unfiltered COUNT(*) over the whole table for its footer total -- as expensive as the page query itself on millions of rows. Tests: add the out-of-range id case. The shared mixin is now tested once in OccurrenceAdminChangelistTest, so the redundant per-admin copy on the Detection admin is removed (its docstring points to the canonical test). Co-Authored-By: Claude <noreply@anthropic.com>

…dmin framework Rework of #999 onto the #1289 post-processing framework. The branch was cut from an old main with its own hand-rolled admin action; ClassMaskingTask and RankRollupTask now subclass BasePostProcessingTask with pydantic config schemas and are triggered through make_post_processing_action (collection scope on SourceImageCollection, single-occurrence scope on Occurrence for class masking). Correctness fixes from the review threads: - Class masking selects the top class from an -inf-masked softmax, so a class excluded by the taxa list can never win even when it had the highest logit; raises when the taxa list excludes every class in the category map. Stored logits stay raw (JSON-safe) and the mask is captured in scores (excluded -> 0). - The masked-output Algorithm is one per (source algorithm, taxa list) and its category map is persisted (previously set in memory only, so masked classifications referenced a null map). - applied_to is populated on new masked classifications (the provenance the API exposes was left blank). - Rank rollup preloads category-map labels in two queries and select_relates the per-row relations instead of dereferencing category_map per classification. Surfaces provenance in the API: applied_to is added to the Classification serializers, and applied_to__algorithm is prefetched in the occurrence list/detail prefetch and the classification viewset to avoid an N+1 on render. Tests: pydantic config validation, admin trigger for both scopes, the masking maths (including the excluded-class guarantee and the all-excluded error), ClassMaskingTask.run() end to end for both scopes, and rank rollup. 20 new tests; full post_processing suite and occurrence query-count tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

mihow and others added 2 commits May 1, 2026 15:13

docs(post-processing): add PR #1289 admin smoke screenshots

a1c8600

Confirmation page (intermediate page rendering with size_threshold form field) and success message (after submitting form, Job 1571 enqueued with typed config payload). Co-Authored-By: Claude <noreply@anthropic.com>

mihow commented May 11, 2026

View reviewed changes

Comment thread ami/main/admin.py Outdated

mihow and others added 2 commits June 4, 2026 16:35

Merge branch 'main' into feat/post-processing-admin-scaffolding

59a7d4b

chore: host PR screenshots on S3 instead of committing to repo

93bd529

Screenshots for the PR description now live in the ami-media-staging object store rather than docs/claude/screenshots/. Keeps binary assets out of the repo history. Co-Authored-By: Claude <noreply@anthropic.com>

mihow changed the title ~~feat(post-processing): admin scaffolding precursor (pydantic schema, form base, parameterized template)~~ Let admins set the size threshold when running the small size filter Jun 4, 2026

mihow changed the title ~~Let admins set the size threshold when running the small size filter~~ Trigger, configure, and track post-processing filters from the Django admin Jun 4, 2026

mihow changed the title ~~Trigger, configure, and track post-processing filters from the Django admin~~ Framework for admins to trigger and review post-processing methods Jun 5, 2026

mihow marked this pull request as ready for review June 5, 2026 00:08

Copilot AI review requested due to automatic review settings June 5, 2026 00:08

Copilot started reviewing on behalf of mihow June 5, 2026 00:08 View session

Copilot AI reviewed Jun 5, 2026

View reviewed changes

Comment thread ami/main/admin.py Outdated

Comment thread ami/ml/post_processing/tests/test_small_size_filter_admin.py

Comment thread ami/templates/admin/post_processing/confirmation.html

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread ami/ml/post_processing/admin/small_size_filter_form.py Outdated

Comment thread ami/ml/post_processing/tests/test_small_size_filter_admin.py Outdated

Comment thread docs/claude/planning/2026-05-01-post-processing-admin-scaffolding-design.md Outdated

This was referenced Jun 5, 2026

feat(tracking): revive occurrence tracking as a post-processing task #1272

Open

Let admins mask classifier predictions to a species list #999

Open

mihow and others added 2 commits June 5, 2026 16:52

coderabbitai Bot reviewed Jun 8, 2026

View reviewed changes

Comment thread ami/ml/post_processing/small_size_filter.py

mihow and others added 2 commits June 8, 2026 14:36

Merge remote-tracking branch 'origin/main' into feat/post-processing-…

b0e50fd

…admin-scaffolding

mihow and others added 3 commits June 9, 2026 17:08

Merge remote-tracking branch 'origin/main' into feat/post-processing-…

8ba683a

…admin-scaffolding

mihow mentioned this pull request Jun 23, 2026

Validate the scope object belongs to the job's project in post-processing tasks #1345

Open

mihow added the post-processing Post-processing task framework and tasks: size filter, class masking, rank rollup, tracking label Jun 23, 2026

mihow and others added 3 commits June 22, 2026 19:37

Merge remote-tracking branch 'origin/main' into feat/post-processing-…

b85e32c

…admin-scaffolding

mihow merged commit 08ca0a4 into main Jun 26, 2026
7 checks passed

mihow deleted the feat/post-processing-admin-scaffolding branch June 26, 2026 01:17

This was referenced Jun 30, 2026

Measure the effect of post-processing filters on final predictions (offline eval) #1360

Open

Let admins roll up low-confidence predictions to a coarser rank #1361

Draft

Uh oh!

Conversation

mihow commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

List of Changes

Admin niceties for reviewing post-processing

Adding a new post-processing filter

Out of scope (deferred to follow-ups)

Rebase impact for downstream PRs

Smoke test (manual, against live stack)

Test plan

Design doc

Notes

Summary by CodeRabbit

Uh oh!

netlify Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-preview canceled.

Uh oh!

netlify Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-ssec canceled.

Uh oh!

coderabbitai Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mihow commented Jun 5, 2026

Uh oh!

coderabbitai Bot commented Jun 5, 2026

Review of PR #1289 — Post-Processing Admin Scaffolding

🔴 Important — default_build_jobs is not atomic

🟡 Medium — Pydantic v1 API with no version pin

🟡 Medium — __init_subclass__ blocks abstract intermediary classes

🟡 Medium — Two DB hits in render_confirmation

🟡 Medium — Custom build_jobs receives scope_resolver=None

🟡 Medium — TestBuildJobsOverrideHook test doesn't actually invoke the action

🟢 Low — name capitalisation inconsistency

✅ What's working well

Uh oh!

mihow commented Jun 5, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mihow commented Jun 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mihow commented May 1, 2026 •

edited

Loading

netlify Bot commented May 1, 2026 •

edited

Loading

netlify Bot commented May 1, 2026 •

edited

Loading

coderabbitai Bot commented May 1, 2026 •

edited

Loading

Review of PR `#1289` — Post-Processing Admin Scaffolding

🔴 Important — `default_build_jobs` is not atomic

🟡 Medium — `__init_subclass__` blocks abstract intermediary classes

🟡 Medium — Two DB hits in `render_confirmation`

🟡 Medium — Custom `build_jobs` receives `scope_resolver=None`

🟡 Medium — `TestBuildJobsOverrideHook` test doesn't actually invoke the action

🟢 Low — `name` capitalisation inconsistency