feat: add return_as parameter for pandas/pyarrow/anndata output #38

Open
timtreis wants to merge 4 commits into afermg:main from timtreis:feat/return-as-formats

timtreis (Collaborator) commented Mar 27, 2026

Summary

  • Add return_as parameter to featurize() supporting "tuple" (default), "pandas", "pyarrow", and "anndata" output formats
  • All optional dependencies are lazy-imported, so no hard dependencies are added. Install the extras via pip install cp_measure[pandas], cp_measure[anndata], etc.
  • AnnData output includes structured metadata: obs (image_id, object_type, label), var (feature_group, feature_type, feature_name, channel, channel_2), uns (config, channels, objects, is_3d)
  • PyArrow output stores per-column metadata in schema fields and table-level metadata (config, channels, is_3d)
  • Pandas output adds image_id, object_type, label as leading columns
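
The lazy-import dispatch described in the summary could look roughly like the sketch below. It is not the code in `_converters.py`: the function name `convert_sketch`, its `rows`/`columns` arguments, and the error wording are assumptions made for illustration. Only the `return_as` values and the `cp_measure[pandas]` extra are taken from the summary.

```python
from typing import Any, Sequence


def convert_sketch(
    rows: Sequence[Sequence[Any]],
    columns: Sequence[str],
    return_as: str = "tuple",
) -> Any:
    """Hypothetical converter: dispatch on return_as, importing lazily."""
    if return_as == "tuple":
        # Default path: no optional dependency is touched.
        return rows, columns
    if return_as == "pandas":
        try:
            # Imported only when this format is actually requested.
            import pandas as pd
        except ImportError as exc:
            raise ImportError(
                'return_as="pandas" needs pandas; '
                "try: pip install cp_measure[pandas]"
            ) from exc
        return pd.DataFrame(list(rows), columns=list(columns))
    # "pyarrow" and "anndata" would follow the same pattern; omitted here.
    raise ValueError(f"unsupported return_as: {return_as!r}")
```

Keeping the import inside the branch means a user who only ever asks for tuples never pays for, or needs, the optional packages.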

Files changed

  • src/cp_measure/featurizer.py: return_as param, @overload type signatures, per-column metadata collection
  • src/cp_measure/_converters.py (new): converter dispatch with lazy imports
  • pyproject.toml: optional dependency groups
  • test/test_return_as.py (new): 31 tests covering all formats
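
For readers unfamiliar with `@overload`, here is a minimal sketch of how a `return_as` literal can drive the return type seen by a type checker. The name `featurize_sketch`, its `data` argument, and the placeholder return values are hypothetical; the real `featurize()` signature is not shown in this PR description. The optional formats are typed as `Any` here to keep the sketch free of optional imports.

```python
from typing import Any, List, Literal, Tuple, overload


@overload
def featurize_sketch(
    data: Any, return_as: Literal["tuple"] = "tuple"
) -> Tuple[List[Any], List[str]]: ...
@overload
def featurize_sketch(
    data: Any, return_as: Literal["pandas", "pyarrow", "anndata"]
) -> Any: ...


def featurize_sketch(data: Any, return_as: str = "tuple") -> Any:
    """Hypothetical stand-in showing only the overload pattern."""
    if return_as not in ("tuple", "pandas", "pyarrow", "anndata"):
        raise ValueError(f"unsupported return_as: {return_as!r}")
    # Placeholder "measurement": wrap the input as a single feature.
    values, names = [data], ["feature"]
    if return_as == "tuple":
        return values, names
    raise NotImplementedError("conversion omitted in this sketch")
```

With overloads like these, a call that omits `return_as` is inferred as returning a tuple, so existing callers keep their current type.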

…ta output

Add a `return_as` parameter to `featurize()` that supports four output
formats: "tuple" (default, backward-compatible), "pandas" (DataFrame),
"pyarrow" (Table with schema metadata), and "anndata" (AnnData with
structured obs/var/uns). All three optional formats are lazy-imported
to avoid hard dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
timtreis force-pushed the feat/return-as-formats branch from 9ebae76 to 7d13002 on April 7, 2026 20:23
timtreis and others added 3 commits April 7, 2026 22:32
- Use raw tuple check instead of DataFrame iloc for None vs NaN safety in _to_anndata
- Add warning when 2D-only features are silently skipped on volumetric data
- Remove redundant no-op ternary in meta_entries initialization
- Remove unnecessary comment in _to_pyarrow
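
The warning mentioned in the second item above might be sketched as follows. The function name `select_features_sketch`, its arguments, and the message text are assumptions, and the feature names used with it are placeholders, not actual cp_measure features.

```python
import warnings
from typing import Iterable, List, Set


def select_features_sketch(
    feature_names: Iterable[str], two_d_only: Set[str], is_3d: bool
) -> List[str]:
    """Hypothetical filter: drop 2D-only features on volumetric input, with a warning."""
    names = list(feature_names)
    if not is_3d:
        return names
    skipped = [name for name in names if name in two_d_only]
    if skipped:
        # Surface the skip instead of dropping features silently.
        warnings.warn(
            f"Skipping 2D-only features on 3D data: {', '.join(skipped)}",
            UserWarning,
            stacklevel=2,
        )
    return [name for name in names if name not in two_d_only]
```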

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
timtreis marked this pull request as ready for review April 7, 2026 20:58