Add an Example column to the taxa list to verify species presence one row at a time #1365
mihow wants to merge 4 commits into
…fication
Add the backend for issue #1320. The taxa list can now return, per taxon, one example occurrence to verify: the best-scoring unverified occurrence for unverified rows (fastest clean ID) and the latest occurrence for already-verified rows (is it still showing up?). It also returns the source occurrence ids behind the Last-seen and Best-score cells so the frontend can deep-link each to the identification modal.
The whole selection is gated behind ?with_example_occurrences=true so the default list keeps its latency budget, especially on the ?collection= (detections-join) path. When off, the three fields serialize as null.
- TaxonQuerySet.with_example_occurrence_ids: three correlated subqueries (index-served on the default path), with the verified-vs-unverified branch chosen by a precomputed verified-taxon set.
- Extract verified_taxon_counts() so the verified rollup is computed once and shared by with_verification_counts and the example dispatch.
- TaxonViewSet hydrates the chosen example ids into {id, detection_id, image_url, score, verified} in one query per page (no N+1).
- TaxonPagination.get_count strips annotations before the COUNT (mirrors ProjectPagination) so the example subqueries are not evaluated for every taxon in the project via TagInverseFilter's .distinct().
Co-Authored-By: Claude <noreply@anthropic.com>
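The selection rule in the commit above can be sketched in plain Python. This is a minimal illustration and not the PR's implementation, which uses correlated subqueries on the queryset. The `Occurrence` dataclass, the `detected_at` field, and `pick_example` are hypothetical names introduced here; the exact-determination filter follows the PR description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical stand-in for an occurrence row; not the project's model."""
    id: int
    taxon_id: int
    score: float
    verified: bool
    detected_at: datetime


def pick_example(
    taxon_id: int,
    occurrences: Iterable[Occurrence],
    verified_taxa: Set[int],
) -> Optional[Occurrence]:
    # Exact determination only: occurrences of descendant taxa are ignored.
    own = [o for o in occurrences if o.taxon_id == taxon_id]
    if taxon_id in verified_taxa:
        # Already-verified row: show the latest occurrence.
        return max(own, key=lambda o: o.detected_at, default=None)
    # Unverified row: show the best-scoring occurrence that is still unverified.
    unverified = [o for o in own if not o.verified]
    return max(unverified, key=lambda o: o.score, default=None)


occurrences = [
    Occurrence(1, taxon_id=10, score=0.71, verified=False, detected_at=datetime(2024, 6, 1)),
    Occurrence(2, taxon_id=10, score=0.93, verified=False, detected_at=datetime(2024, 5, 1)),
    Occurrence(3, taxon_id=20, score=0.80, verified=True, detected_at=datetime(2024, 4, 1)),
    Occurrence(4, taxon_id=20, score=0.60, verified=False, detected_at=datetime(2024, 7, 1)),
]
verified_taxa = {20}  # precomputed once, then reused for every row

example_unverified = pick_example(10, occurrences, verified_taxa)   # best score
example_verified = pick_example(20, occurrences, verified_taxa)     # latest
example_rollup_only = pick_example(30, occurrences, verified_taxa)  # no own occurrences
```

Computing `verified_taxa` once and passing it in mirrors the "precomputed verified-taxon set" idea: the branch is decided by a set lookup rather than a per-row query.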
… list
Frontend for issue #1320. Each taxon row gains a non-sortable Example column showing a thumbnail of the occurrence to verify; clicking it opens the existing occurrence identification modal (Agree / Suggest ID) over the taxa list, so a user can sweep unverified taxa and confirm presence one row at a time. The Last-seen and Best-score cells link to the same modal for their source occurrence.
- Extract OccurrenceDetailsDialog from the occurrences page and reuse it on the taxa list, keyed off a ?verifyOccurrence= search param so the list stays behind.
- Species model: verificationExample / bestScoringOccurrenceId / lastDetectedOccurrenceId getters over the new API fields; useSpecies requests ?with_example_occurrences=true.
- Dim already-verified rows (new optional rowClassName hook on the Table) and mark them with a verified icon.
- Invalidate the taxa list query after an identification so verified counts and the example thumbnail refresh without a reload.
Co-Authored-By: Claude <noreply@anthropic.com>
…taxa list
Entering the occurrence modal from the taxa list is a verification action, so it now opens on the Identification tab (Agree / Suggest ID) instead of Fields. Added an optional defaultTab prop to OccurrenceDetailsDialog; the occurrences list keeps its Fields default.
Co-Authored-By: Claude <noreply@anthropic.com>
Follow-up to the presence-verification Example column (#1320), applying the structural fixes surfaced by a takeaway review.
Gate draft-project data behind visibility. The taxa list annotates observed-occurrence data (per-taxon counts and, under ?with_example_occurrences, example occurrence ids plus detection crop URLs) via subqueries that are not visibility-gated, so a non-member could read it for a draft project. TaxonViewSet.get_queryset now refuses a project the user cannot see with a 404, matching the sibling project-scoped taxa endpoints (top-identifiers, model-agreement).
Single-source the nested example shape. ExampleOccurrenceSerializer is now the one declaration of {id, detection_id, image_url, score, verified}; the view hydrates a page of occurrences and serializes them through it instead of hand-building a dict, and it types the field for the OpenAPI schema.
Remove the duplicate best-detection subquery. with_best_detection() now also annotates best_detection_id, picked from the same detection as best_detection_path, so the example's detection_id and its image can no longer drift apart.
Tests: a higher-rank taxon used directly for identifications now has a pinned example (only pure roll-up ancestors are NULL); the collection path asserts the example is drawn from the same set occurrences_count reports; a draft project hides examples from a non-member while a member still sees them.
Co-Authored-By: Claude <noreply@anthropic.com>
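The "one query per page" hydration and the single-sourced nested shape from the commits above can be sketched together. This is a rough model under stated assumptions, not the view's actual code: `serialize_example`, `hydrate_page`, `fetch_many`, and the in-memory `STORE` are hypothetical, and plain dicts stand in for model instances and the DRF serializer.

```python
from typing import Callable, Dict, Iterable, List

# The nested shape is declared exactly once (hypothetical stand-in for the serializer).
EXAMPLE_FIELDS = ("id", "detection_id", "image_url", "score", "verified")


def serialize_example(occurrence: Dict) -> Dict:
    # Only the declared fields leave the backend; anything else is dropped.
    return {field: occurrence[field] for field in EXAMPLE_FIELDS}


def hydrate_page(
    rows: List[Dict],
    fetch_many: Callable[[List[int]], Iterable[Dict]],
) -> List[Dict]:
    # Collect every example id on the page, then fetch them in a single call.
    ids = [r["example_occurrence_id"] for r in rows if r["example_occurrence_id"] is not None]
    by_id = {o["id"]: o for o in fetch_many(ids)} if ids else {}
    page = []
    for row in rows:
        occ = by_id.get(row["example_occurrence_id"])
        page.append({
            "taxon_id": row["taxon_id"],
            "example_occurrence": serialize_example(occ) if occ else None,
        })
    return page


# In-memory stand-in for the database, with a counter to show the call count.
STORE = {
    7: {"id": 7, "detection_id": 70, "image_url": "/crops/70.jpg", "score": 0.9,
        "verified": False, "project_id": 1},
    8: {"id": 8, "detection_id": 80, "image_url": "/crops/80.jpg", "score": 0.4,
        "verified": True, "project_id": 1},
}
calls = {"count": 0}


def fetch_many(ids: List[int]) -> List[Dict]:
    calls["count"] += 1
    return [STORE[i] for i in ids if i in STORE]


rows = [
    {"taxon_id": 10, "example_occurrence_id": 7},
    {"taxon_id": 11, "example_occurrence_id": None},  # roll-up-only row
    {"taxon_id": 12, "example_occurrence_id": 8},
]
page = hydrate_page(rows, fetch_many)
```

Because the ids are gathered before fetching, the number of fetches stays at one regardless of how many rows the page holds.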
Summary
This adds a presence-verification workflow to the taxa list. Curators reviewing a project's species often need to answer a simple question one taxon at a time: is this really here? Today that means leaving the list, hunting down an occurrence, and verifying it in a separate view. This change surfaces one representative occurrence per taxon directly in the list as an Example thumbnail. Clicking it opens the existing occurrence identification modal right over the list, on the Identification tab, so the reviewer can confirm the taxon and move to the next row without losing their place. The Last-seen and Best-score cells link into the same modal, verified rows are dimmed and marked, and confirming an identification updates the row in place.
The whole feature is opt-in behind a query parameter, so the default taxa list keeps its current latency budget. When the parameter is off, the new fields serialize as null and no extra queries run.
Closes #1320.
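The opt-in gate can be illustrated with a small sketch: a strictly parsed boolean flag, and a response row whose three new fields stay null without any work being done when the flag is off. The helper names and the accepted spellings (`true`/`1`, `false`/`0`) are assumptions made for this example; the PR only states that an unrecognised value is rejected with a 400.

```python
from typing import Callable, Dict, Mapping, Optional

FLAG = "with_example_occurrences"
TRUE_VALUES = {"true", "1"}    # assumed spellings, not confirmed by the PR
FALSE_VALUES = {"false", "0"}  # assumed spellings, not confirmed by the PR


def parse_strict_bool(params: Mapping[str, str], name: str, default: bool = False) -> bool:
    raw = params.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # A view would translate this into an HTTP 400 response.
    raise ValueError(f"{name} must be a boolean, got {raw!r}")


def example_fields(params: Mapping[str, str], compute: Callable[[], Dict]) -> Dict[str, Optional[object]]:
    if not parse_strict_bool(params, FLAG):
        # Flag off: same keys, null values, and the expensive path is never called.
        return {
            "example_occurrence": None,
            "best_scoring_occurrence_id": None,
            "last_detected_occurrence_id": None,
        }
    return compute()


compute_calls = {"count": 0}


def compute() -> Dict:
    compute_calls["count"] += 1
    return {
        "example_occurrence": {"id": 7},
        "best_scoring_occurrence_id": 7,
        "last_detected_occurrence_id": 9,
    }


off = example_fields({}, compute)
calls_after_off = compute_calls["count"]
on = example_fields({FLAG: "true"}, compute)

try:
    example_fields({FLAG: "abc"}, compute)
    rejected = False
except ValueError:
    rejected = True
```

Keeping the keys present with null values gives clients a stable response shape whether or not the flag is set.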
List of Changes
- The Example column reuses the existing occurrence modal, extracted as OccurrenceDetailsDialog (route-driven via ?verifyOccurrence=<id>), rather than a forked copy. Backend returns a nested example_occurrence {id, detection_id, image_url, score, verified}.
- The Last-seen and Best-score cells get their source occurrence ids as best_scoring_occurrence_id and last_detected_occurrence_id; the cells render as links to ?verifyOccurrence=.
- All three fields are gated behind ?with_example_occurrences=true; off by default they annotate NULL and run no subqueries.
- TaxonQuerySet.with_example_occurrence_ids picks between three correlated subqueries using a precomputed verified-taxon set.
- TaxonPagination.get_count strips annotations before the COUNT so the example subqueries are never pulled into the count query by the tag filter's .distinct().
- verified_taxon_counts() was extracted so both the verified-count annotation and the example dispatch share a single pass.
Hardening from a takeaway review (commit fd5ac069)
A structural review after the first pass surfaced three things, now fixed:
- TaxonViewSet.get_queryset now refuses a project the requester cannot see with a 404, the same way the other project-scoped taxa endpoints already do. (The counts were exposed before this feature; the example adds object ids and image URLs, so the gate matters more now.)
- The nested {id, detection_id, image_url, score, verified} object is now defined once in an ExampleOccurrenceSerializer (also typing the field for the OpenAPI schema) instead of being hand-built as a dict in the view.
- with_best_detection() now also returns best_detection_id, chosen from the same detection as the image path, so the returned detection_id cannot point at a different detection than the thumbnail.
New tests cover each: a non-member is refused a draft project's taxa list while a member still sees examples; a higher-rank taxon used directly for identifications gets a pinned example; and the ?collection= path draws the example from the same set the count reports.
Detailed Description
Selection semantics (hybrid, exact-determination)
verified_count rolls up to ancestors, so higher-rank rows (genus, family) get a NULL example: the example is exact-determination, not rolled up.
One gotcha worth recording: verifying an occurrence overrides its determination_score, so verified occurrences tie on score. The test fixture accounts for this by marking a taxon verified through a separate occurrence rather than relying on score ordering.
What is gated behind query parameters
Opt-in (default off), this feature's cost gate:
- ?with_example_occurrences=true enables example_occurrence, best_scoring_occurrence_id, last_detected_occurrence_id. Off: all three annotate Value(None), so there are no subqueries, no hydration query, and a stable response shape. On: three subqueries fold into the page SELECT plus one hydration query per page. Parsed strictly (?with_example_occurrences=abc → 400).
Dispatch parameters (change which query shape runs, pre-existing):
- ?collection=<id> switches the observation counts from correlated subqueries to conditional aggregation over the detections join. This is why the example subqueries are opt-in: they degrade to per-row scans on this path.
- ?include_unobserved=true drops the observed-only restriction.
- ?verified=true|false filters rows to the verified / unverified set.
- ?apply_defaults=false bypasses the default score-threshold and taxa include/exclude filters.
Always-on:
- occurrences_count, events_count, last_detected, best_determination_score, verified_count.
Performance
Measured on the real DRF path (queryset + pagination COUNT + example hydration + full serialization), 25 taxa per page, cold (query cache flushed) vs warm, on three large projects. Projects are anonymised; sizes are what matter.
Findings:
- The three example subqueries fold into the page SELECT; the extra query is the single per-page hydration. Query count does not scale with page size or offset.
- On one project, the always-on verification rollup (verified_taxon_counts) alone accounts for ~743 ms, versus ~25 ms on the others. It is a Python pass whose cost grows with the number of verified taxa. That rollup predates this change; this feature adds only ~12 ms on top of it. It is called out here as a separate, pre-existing hotspot, not introduced by this PR.
These are wall-clock measurements from a long-running local stack (PostgreSQL shared buffers warm); "cold" isolates the query-cache miss.
Testing
- … NULL case, deployment scoping, the ?collection= fan-out, and the 400-on-bad-flag case; plus query-count tests asserting the hydration does not scale with page size and that the example subqueries are stripped from the pagination COUNT. Broader taxa regression suite passes.
- makemigrations --check is clean (query-only change, no migration).
- … tsc / eslint for the frontend).
Possible follow-up (out of scope)
The always-on verification rollup is O(verified occurrences), computed in a Python pass. On verification-heavy projects that is the dominant cost of the taxa list, independent of this feature. A denormalised per-(project, taxon) aggregate refreshed on the cached-count pattern would remove it. Worth a separate ticket.
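To make the cost argument concrete, here is a rough sketch of a rollup of this kind. It is not the code behind verified_taxon_counts(), which is not shown in this PR description; `rolled_up_verified_counts` and `parent_of` are hypothetical names, and the taxonomy is a toy parent map.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def rolled_up_verified_counts(
    verified_taxon_ids: Iterable[int],
    parent_of: Dict[int, Optional[int]],
) -> Counter:
    """Count verified occurrences per taxon, crediting every ancestor as well."""
    counts: Counter = Counter()
    for taxon_id in verified_taxon_ids:  # one iteration per verified occurrence
        node: Optional[int] = taxon_id
        while node is not None:          # walk up: species -> genus -> family
            counts[node] += 1
            node = parent_of.get(node)
    return counts


# Toy taxonomy: two species (101, 102) in genus 11, which sits in family 1.
parent_of = {101: 11, 102: 11, 11: 1, 1: None}
# One entry per verified occurrence, holding its determined taxon id.
verified_taxon_ids = [101, 101, 102]

counts = rolled_up_verified_counts(verified_taxon_ids, parent_of)
```

The loop body runs once per verified occurrence and once per ancestor level, so the work grows with verification activity rather than with page size. A stored per-(project, taxon) aggregate would replace this pass with a lookup.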