feat(accelerator): numba costes colocalization by timtreis · Pull Request #62 · afermg/cp_measure

timtreis · 2026-06-03T22:03:01Z

Stacked on #60. The 5th and final colocalization feature.

Per-object port of get_correlation_costes + bisection_costes/linear_costes into core/numba/_costes.py::costes_per_object, reusing #60's grouped layout (labels_to_offsets + flatten_pairs_grouped) — no sort. All three fast_costes modes (M_FASTER bisection, M_FAST/M_ACCURATE linear), control flow bit-reproduced. The reference's dead calculate_threshold is skipped; thr accepted for parity but unused.
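The "grouped layout, no sort" idea can be illustrated with a counting-style scatter: count the pixels per label, prefix-sum the counts into offsets, then write each pixel pair into its object's slot range. This is a minimal pure-Python sketch; `offsets_from_labels` and `group_pairs` are hypothetical stand-ins, and the real signatures of `labels_to_offsets` and `flatten_pairs_grouped` from #60 are not shown in this PR, so they may differ.

```python
def offsets_from_labels(labels, n_objects):
    """CSR-style offsets: object k (1-based) owns slots offsets[k-1]:offsets[k]."""
    counts = [0] * (n_objects + 1)
    for lab in labels:
        if lab > 0:  # label 0 is treated as background (assumption)
            counts[lab] += 1
    offsets = [0] * (n_objects + 1)
    for k in range(1, n_objects + 1):
        offsets[k] = offsets[k - 1] + counts[k]
    return offsets


def group_pairs(labels, p1, p2, offsets):
    """Scatter pixel pairs into per-object contiguous runs in one pass, no sort."""
    cursor = offsets[:-1]  # slice copy: next free slot for each object
    total = offsets[-1]
    g1 = [0.0] * total
    g2 = [0.0] * total
    for lab, a, b in zip(labels, p1, p2):
        if lab > 0:
            slot = cursor[lab - 1]
            g1[slot] = a
            g2[slot] = b
            cursor[lab - 1] = slot + 1
    return g1, g2


labels = [0, 2, 1, 2, 1, 0, 2]
p1 = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
p2 = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
offsets = offsets_from_labels(labels, n_objects=2)
g1, g2 = group_pairs(labels, p1, p2, offsets)
```

Each object's pixels then sit in one contiguous slice, so a per-object kernel can loop over `g1[offsets[k-1]:offsets[k]]` without any label lookup.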

Pearson-on-subset matches scipy.stats.pearsonr's op order; error_model="numpy" so a constant subset → NaN (not ZeroDivisionError), matching scipy.
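The operation order described here (centre, normalise each vector, accumulate, clamp) can be sketched in plain Python. This is an illustration, not the kernel from this PR: `pearson_scipy_order` is a hypothetical name, and the explicit zero-norm check stands in for what `error_model="numpy"` provides inside a numba kernel, since plain Python would raise `ZeroDivisionError` on `0.0 / 0.0`.

```python
import math


def pearson_scipy_order(x, y):
    """Pearson r: centre, normalise each vector, accumulate, clamp."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xm = [v - mean_x for v in x]  # centre
    ym = [v - mean_y for v in y]
    norm_x = math.sqrt(sum(v * v for v in xm))
    norm_y = math.sqrt(sum(v * v for v in ym))
    if norm_x == 0.0 or norm_y == 0.0:
        # Constant input: return NaN instead of dividing by zero.
        return math.nan
    # Normalise each vector first, then accumulate the products.
    r = sum((a / norm_x) * (b / norm_y) for a, b in zip(xm, ym))
    return max(min(r, 1.0), -1.0)  # clamp rounding drift into [-1, 1]


r_pos = pearson_scipy_order([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_neg = pearson_scipy_order([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
r_const = pearson_scipy_order([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

Normalising before accumulating keeps every intermediate product within [-1, 1], and the final clamp absorbs the last bit of rounding error.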

Exact vs numpy on float pixels (scale=1); integer-dtype diverges by design (reference overflows z = fi + si). bzyx via to_bzyx-twice. Speedup 41.8× (1080², 144 obj).
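The integer-dtype divergence comes from wraparound: adding two fixed-width unsigned integers keeps only the low bits. The sketch below emulates that with a bit mask in plain Python rather than calling numpy; `add_wrapping` is a hypothetical helper used only to show why `z = fi + si` differs between uint8 input and float input.

```python
def add_wrapping(fi, si, bits=8):
    """Emulate element-wise unsigned integer addition that wraps at 2**bits."""
    mask = (1 << bits) - 1
    return [(a + b) & mask for a, b in zip(fi, si)]


fi = [200, 10]
si = [100, 20]

z_uint8 = add_wrapping(fi, si)                         # 200 + 100 wraps past 255
z_float = [float(a) + float(b) for a, b in zip(fi, si)]  # no wraparound
```

Here the first element is 44 under uint8 arithmetic but 300.0 in float, so any statistic derived from `z` can differ between the two paths.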

Tests: kernel control-flow vs the real reference search at scale=255 (exercises the multi-iteration path), pearson vs scipy, regression vs reference, end-to-end golden 2D/3D/batch × 3 modes. Full suite 145 passed, lint clean. Stack: #59 → #60 → this.

Per-object port of get_correlation_costes + bisection_costes/linear_costes into core/numba/_costes.py::costes_per_object, reusing the #60 grouped layout (labels_to_offsets + flatten_pairs_grouped), no sort.

All three fast_costes modes: M_FASTER (bisection), M_FAST/M_ACCURATE (linear), control flow bit-reproduced (window math, num_true recompute cache, > vs >= threshold asymmetry). The reference's dead calculate_threshold call is skipped; thr is accepted for parity but unused.

Pearson-on-subset matches scipy.stats.pearsonr's order (centre, normalise each vector, accumulate, clamp). error_model="numpy" so a constant subset yields NaN (not ZeroDivisionError), matching scipy's ConstantInputWarning -> nan.

Exact vs numpy on float pixels (scale=1); integer-dtype diverges by design (the reference overflows z = fi + si in uint8/uint16). bzyx via to_bzyx-twice like the other coloc features. Speedup 41.8x (1080^2, 144 obj, float).

Tests: kernel control-flow vs the real reference search at scale=255 (exercises the multi-iteration path), pearson vs scipy, regression vs reference, end-to-end golden 2D/3D/batch x 3 modes. Full suite 145 passed, lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Extract _flatten_image() (mask-contiguity + labels_to_offsets + flatten_pairs_grouped), shared by _run and the costes runner; removes the duplicated per-image prep chain.
- Drop the dead any_fi/any_si flags in costes_per_object: tot_* is read only when n_comb > 0, which already guarantees a pixel strictly above each threshold, so the reference's any(>thr) guard is always true there.

Behaviour-preserving; 57 coloc/costes tests green, lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

bisection_costes called _count_combt then _pearson_combt for each visited threshold, but _pearson_combt's first pass recomputes the exact same subset count. Fuse them into _count_pearson_combt -> (cnt, r): one count pass, and the Pearson passes only when cnt > 2 (else r is nan and unused).

Bit-identical to the previous kernel (same predicate, same accumulation order; verified array_equal across correlated AND anti-correlated objects, i.e. both search directions). ~9% faster (23.0 -> 20.9 ms, 1080^2/144 obj). 31 costes tests green. (linear/accurate modes keep _count_combt + the num_true cache.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

timtreis · 2026-06-04T04:48:26Z

Follow-up perf (commit 8b4ee9e): the bisection search called _count_combt then _pearson_combt for each visited threshold, but _pearson_combt's first pass recomputes the exact same subset count. Fused into _count_pearson_combt -> (cnt, r): one count pass, Pearson passes only when cnt > 2. Bit-identical (verified array_equal across correlated AND anti-correlated objects — both search directions), ~9% faster (23.0→20.9 ms, 1080²/144 obj). 31 costes tests green. (linear/accurate modes keep the separate count + num_true cache.)
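The fusion can be sketched as a single function that counts the selected subset once and runs the Pearson passes only when more than two pixels are selected. This is a plain-Python illustration under stated assumptions: `count_pearson_below` is a hypothetical name, and the selection predicate (a pixel is selected when either channel is below its threshold) is assumed for the example rather than taken from this PR.

```python
import math


def count_pearson_below(fi, si, thr_fi, thr_si):
    """Return (cnt, r) for the subset where either channel is below its threshold."""
    # Pass 1: count the subset once (assumed predicate, see lead-in).
    selected = [(a, b) for a, b in zip(fi, si) if a < thr_fi or b < thr_si]
    cnt = len(selected)
    if cnt <= 2:
        return cnt, math.nan  # too few points; r is unused by the caller

    # Pearson passes, run only when cnt > 2.
    mean_a = sum(a for a, _ in selected) / cnt
    mean_b = sum(b for _, b in selected) / cnt
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a, _ in selected))
    norm_b = math.sqrt(sum((b - mean_b) ** 2 for _, b in selected))
    if norm_a == 0.0 or norm_b == 0.0:
        return cnt, math.nan  # constant subset
    r = sum(((a - mean_a) / norm_a) * ((b - mean_b) / norm_b) for a, b in selected)
    return cnt, max(min(r, 1.0), -1.0)


fi = [0.1, 0.2, 0.3, 0.9]
si = [0.2, 0.4, 0.6, 0.9]
cnt_many, r_many = count_pearson_below(fi, si, 0.5, 0.5)
cnt_few, r_few = count_pearson_below(fi, si, 0.15, 0.15)
```

Returning the count alongside `r` lets the search loop branch on `cnt` without a second scan of the object's pixels.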

…y features

The five public get_correlation_* functions each re-ran the whole fused coloc_per_object kernel (+ flatten), so computing several coloc features paid ~5x redundant work, and the numba backend lost to merged numpy on manders/overlap/rwc at the large image size.

Add get_correlation_all(p1, p2, masks, features=None): one flatten + one kernel pass returns the requested coloc groups (None = all). The cheap block (Pearson+slope, Manders, Overlap, K) is one pass; RWC's rank sort and the Costes kernel are gated to the requested set. Stateless: fusion happens by requesting the set in ONE call, not via any cache. It is the efficient entry point for any caller (not only featurize).

The five single-feature functions become thin gated wrappers over it (single source; each now computes only its tier, so even one direct call is minimal). The numba correlation registry KEEPS per-group keys so the featurizer's per-group selection (_collect_correlation_features) keeps working; an earlier single-entry registry broke featurize with KeyError 'pearson'.

Large (1080^2, 142 obj): all-coloc 38 ms (fused) vs 103 ms (5 separate) vs 140 ms (numpy) = 2.7x / 3.7x.

Tests: subset/gating, bit-identity vs the wrappers, empty/batch/3D, unknown-group error, and featurize-runs-under-numba.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
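The gating described in this commit can be sketched as a planner that maps the requested groups to the stages that would run. Everything here is illustrative: only the key `'pearson'` and the `features=None` convention come from the commit message; the other group keys, the stage names, `plan_stages` itself, and the use of `ValueError` for unknown groups are assumptions.

```python
CHEAP_GROUPS = {"pearson", "manders", "overlap", "k"}  # one shared pass (keys assumed)
ALL_GROUPS = CHEAP_GROUPS | {"rwc", "costes"}


def plan_stages(features=None):
    """List the stages a single fused call would run for the requested groups."""
    requested = set(ALL_GROUPS) if features is None else set(features)
    unknown = requested - ALL_GROUPS
    if unknown:
        # Error type is an assumption; the PR only says unknown groups raise.
        raise ValueError(f"unknown coloc group(s): {sorted(unknown)}")

    stages = ["flatten"]  # done once, whatever is requested
    if requested & CHEAP_GROUPS:
        stages.append("cheap_pass")
    if "rwc" in requested:
        stages.append("rwc_rank_sort")
    if "costes" in requested:
        stages.append("costes_kernel")
    return stages


everything = plan_stages()               # None means all groups
only_pearson = plan_stages(["pearson"])  # what a thin wrapper would request
only_costes = plan_stages(["costes"])
```

A single-feature wrapper in this scheme just forwards a one-element `features` list, so it pays for the flatten and its own tier only.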

timtreis and others added 2 commits June 4, 2026 00:02

timtreis changed the title ~~feat(accelerator): numba costes colocalization (all 3 modes)~~ feat(accelerator): numba costes colocalization Jun 3, 2026

timtreis force-pushed the feat/numba-coloc-costes branch from 29dbe9e to b0fe736 Compare June 6, 2026 23:28

timtreis added the numba label Jun 9, 2026

timtreis wants to merge 4 commits into feat/numba-coloc from feat/numba-coloc-costes
