[3/3] Add make_stitched_labels (materialise stitched labels)#1207
Conversation
Lock the validation-sweep outcome: at min_confidence=0.5 the deterministic fixture recovers >=50% of cut pieces with no intact false-merges. min_confidence default stays 0.7 (full attainable recall, zero false merges); gap_proximity kept in the 5-feature score. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…bels) PR-C of the scverse#1170 split. Adds sq.experimental.im.make_stitched_labels: reads the stitch_group_id mapping written by assign_stitch_groups and registers a new labels element where each stitched group shares one ID (original labels left untouched), plus an optional collapsed AnnData (one row per group) with configurable merge_strategy. Stacks on PR-B (feature/tiling-stitch-algo) and reuses its full-pipeline tests. Scale rework (must handle 100k x 100k, never materialise the full array): - LUT remap stays lazy; labels present in the image but absent from the QC table (e.g. min_area-filtered cells) pass through instead of indexing OOB. - join_labels is now chunk-aware via dask map_overlap (depth = close_radius+2), never computing the whole image or running a full-frame regionprops. - group aggregation vectorised (argsort+split, pandas groupby) instead of the O(cells x groups) per-group np.where scan. - X aggregation is sparse-safe: sum/mean via sparse matmul, first via sparse gather, other reducers per-group-bounded; the full matrix is never densified. - validate label_id/is_stitched up front; squeeze a leading singleton dim. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…our change - inline equivalent_diameter + largest_contour (single consumer _tiling_stitch) into their call site and delete utils/_geometry.py; keep both guards (1px pad precondition + empty-contour skip). - route _resolve_qc_params through the shared resolve_params helper instead of a duplicated Mapping-validation block; drop the now-dead _QC_FIELDS and unused fields import. utils/_params.py now has two genuine callers. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…le + validation Addresses a max-effort review of the stitched-labels path: - collapsed table no longer sums QC-contract columns: centroid_y/x take the mean of the pieces (a merged cell's position), the per-piece cut-artifact scores take the max (worst piece); merge_strategy now applies only to genuine additive user features. Drops the dtype-restore that re-narrowed/truncated promoted sums and means. - _aggregate_X: sum no longer treated as integer-preserving, so an integer .X promotes to float64 instead of wrapping (uint16 200000 -> 3392); a callable strategy is now applied to singleton groups too (obs/X parity). - multiscale: parse the new element with the resolved scale's transform (not the DataTree base/Identity), so the output overlays correctly; documented as a single-scale element at the QC scale's resolution. - validate label_id (positive, unique, non-NaN) and merge_strategy up front, so bad input fails fast even with write_table=False. - preserve varm (var axis is unchanged); correct the join_labels docstring (it is lazy via map_overlap and bridges seams up to 2*close_radius px, not "forces materialisation"). - add TestReviewFixes regression locks (each fails on the pre-fix code). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1207 +/- ##
==========================================
+ Coverage 76.53% 76.81% +0.27%
==========================================
Files 63 63
Lines 9070 9267 +197
Branches 1523 1565 +42
==========================================
+ Hits 6942 7118 +176
- Misses 1544 1547 +3
- Partials 584 602 +18
🚀 New features to boost your workflow:
|
| contour = largest_contour(mask) | ||
| if contour is None: | ||
| contours = find_contours(mask, 0.5) | ||
| if not contours: # degenerate mask traces nothing; skip it |
There was a problem hiding this comment.
if not contours is ambigious if it means it's 0 the check should be to if contours != 0 or something like this.
There was a problem hiding this comment.
if contours != 0 wouldn't work here because upon failure, it'd return an empty list
There was a problem hiding this comment.
yeah then len(contours) == 0 then right? that's what I mean by being unclear
| Key for the new labels element. Defaults to | ||
| ``"{labels_key}_stitched"``. Existing element at this key is | ||
| overwritten with a warning. | ||
| table_key_added |
There was a problem hiding this comment.
I might be wrong on this, but can't we notice if table_key_added is not None, write_table would be automatically true? I think it's a default at the price of adding a bool argument.
There was a problem hiding this comment.
Almost 🤔 We have these states:
A) write_table=True + table_key_added=None --> write table with default key ({out_key}_table)
B) write_table=True + table_key_added="foo" --> write table with key (foo)
C) write_table=False + table_key_added=None --> no table
D) write_table=False + table_key_added="foo" --> error
For that granularity of "I don't care about the name, just use defaults", we need to enable the permutations
There was a problem hiding this comment.
yeah thats what I mean, we add one argument write_table for just supporting: "I don't care about the name, just use defaults". This is a preference thing so it's fine I guess
selmanozleyen
left a comment
There was a problem hiding this comment.
Some nitpicks but it seems quite good!
- enforce table_key_added != labels element key before any sdata mutation (SpatialData names are unique across element types; avoids a half-written sdata and a cryptic late KeyError) - reject the contradictory write_table=False + table_key_added combo instead of silently ignoring the key - replace dead pytest.skip guards with asserts (sdata_tile_boundary is seeded, so stitched cells are a deterministic precondition) - add regression cases for both new validation errors - drop the make_stitched_labels release note (handled at release time)
|
yey! Will send a tutorial to squidpy_notebooks next |
Completes the tile-cut stitching arc (
[1/3]#1188,[2/3]#1193) with the materialisation step.Usage
Knobs:
merge_strategy(how additive user features aggregate across pieces; default"sum"),join_labels(close the seam so a group is one connected component; stays lazy on dask),write_table,inplace.Also in this PR
utils/_geometry.pyhelpers into their one caller (module deleted) and routed_resolve_qc_paramsthrough the sharedresolve_params. No behaviour change.Closes the #1170 arc.