Skip to content

Add GapDetector for zero/NaN run captioning#31

Open
max-rosenblattl wants to merge 8 commits intomainfrom
max-rosenblattl/gap-detector
Open

Add GapDetector for zero/NaN run captioning#31
max-rosenblattl wants to merge 8 commits intomainfrom
max-rosenblattl/gap-detector

Conversation

@max-rosenblattl
Copy link
Copy Markdown
Collaborator

@max-rosenblattl max-rosenblattl commented Mar 30, 2026

GapDetector

♻️ Current situation & Problem

Long stretches of zero or NaN in wearable sensor data are silently ignored — statistical and structural extractors strip them before processing. This means captions never describe these periods, leaving a gap in the language-signal alignment.

⚙️ Release Notes

  • detectors/__init__.py — Extended DetectionResult with "gap" event type
  • detectors/gap.pyGapDetector: detects contiguous zero/NaN runs ≥ min_duration (default 30min, configured to 120min for MHC)
  • templates/templates.json — 5 observational "reads zero" templates
  • mhc/constants.pyGapDetector(min_duration=120) wired into all continuous channels

📚 Documentation

Overrides detect() rather than _detect() because the base StructuralDetector.detect() strips NaN/zeros — the GapDetector needs to look at them. Uses vectorized np.diff on a padded boolean mask for efficient run detection.

✅ Testing

Verified with synthetic data: correct gap boundaries, sub-threshold filtering, and end-to-end template formatting through StructuralExtractor.

Code of Conduct & Contributing Guidelines

By creating and submitting this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

Adds Apple Watch non‑wear detection: new DeviceWearExtractor emits watch_wear annotations based on overlapping gaps in heart-rate and active-energy channels; integrates extractor into annotator/visualizer/captionizer pipelines, updates channel configs with watch_device, exposes a "watch" overlay in the Explorer, and adds tests.

Changes

Cohort / File(s) Summary
Configuration & Constants
mhc/constants.py, mhc_weekly/constants.py, extractors/__init__.py
Added WATCH_DEVICE_CHANNELS and a watch_device group in channel configs; extended VALID_CAPTION_TYPES to include "watch_wear".
Device Wear Extraction
extractors/device_wear.py
New DeviceWearExtractor (caption_type "watch_wear") that detects non‑wear windows by intersecting HR and active‑energy gap masks and emits annotations with windows and channel indices.
Integration into Pipelines
captionizer.py, visualizer.py
Added DeviceWearExtractor(MHC_CHANNEL_CONFIG) to Annotator pipelines in main execution paths so watch_wear annotations are produced for runs.
Interactive Explorer UI & Rendering
explorer.py
Integrated extractor into SensorExplorer, added "watch" overlay toggle and state, added SensorExplorer._device_wear_windows(...), updated hit-target/search logic to consider watch_wear, and render shaded non‑wear regions.
Extractor API & Structural Changes
extractors/structural.py
Removed MAX_EVENTS_PER_SIGNAL truncation so structural extractor emits all detected events per signal.
Tests & Docs
tests/test_device_wear_extractor.py, README.md
Added comprehensive unit tests for DeviceWearExtractor and updated README overlay docs to include the "watch" toggle.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Explorer as SensorExplorer
    participant Annotator
    participant DeviceWearExt as DeviceWearExtractor
    participant Recording
    participant UI as OverlayRenderer

    User->>Explorer: load recording
    Explorer->>Annotator: create (includes DeviceWearExtractor)
    Annotator->>DeviceWearExt: extract(recording)
    DeviceWearExt->>Recording: read watch_device channels (HR, ActiveEnergy)
    DeviceWearExt->>DeviceWearExt: build gap masks (NaN or 0)
    DeviceWearExt->>DeviceWearExt: intersect masks, filter by min duration
    DeviceWearExt-->>Annotator: return watch_wear annotations
    Annotator-->>Explorer: attach annotations to Recording
    User->>UI: toggle "watch" overlay
    UI->>Explorer: request device wear windows
    Explorer->>Explorer: _device_wear_windows(recording, signal_idx)
    Explorer->>UI: render axvspan shaded regions for each window
    UI-->>User: display non-wear regions
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • milanagm
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add GapDetector for zero/NaN run captioning' is specific and clearly summarizes the main change—introducing a GapDetector for detecting and captioning zero/NaN runs in sensor data.
Description check ✅ Passed The description comprehensively explains the problem (ignoring long zero/NaN stretches), solution (GapDetector), and implementation details, which directly relate to the changeset across multiple files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch max-rosenblattl/gap-detector

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
detectors/gap.py (2)

36-37: Prefer explicit NotImplementedError in _detect for clarity.

Since this detector intentionally bypasses _detect, raising explicitly avoids silent failures if call paths change later.

✏️ Optional clarity tweak
     def _detect(self, series: np.ndarray, indices: np.ndarray) -> list[DetectionResult]:
-        return []
+        raise NotImplementedError("GapDetector implements detect() directly.")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/gap.py` around lines 36 - 37, The _detect method currently returns
an empty list which can silently hide incorrect call paths; replace the direct
return with raising NotImplementedError (e.g., in the _detect(series:
np.ndarray, indices: np.ndarray) -> list[DetectionResult] method) and include a
short message indicating this detector bypasses _detect so callers get an
explicit error if it is ever invoked.

33-33: Consider zip(..., strict=True) to enforce invariants between start/end arrays.

This will fail fast if the arrays ever diverge unexpectedly. Note that strict requires Python 3.10+; verify compatibility with your project's minimum Python version before applying.

🔧 Suggested change
-            for s, e in zip(starts[mask], ends[mask])
+            for s, e in zip(starts[mask], ends[mask], strict=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/gap.py` at line 33, The zip over filtered start/end arrays (the for
s, e in zip(starts[mask], ends[mask]) call in detectors/gap.py) should enforce
that the two sequences have equal length; update the call to zip(starts[mask],
ends[mask], strict=True) to fail fast if they diverge, and if your project must
support Python <3.10 instead of using strict=True add an explicit check (e.g.,
assert len(starts[mask]) == len(ends[mask]) or raise a ValueError) before the
zip to preserve the invariant.
mhc/constants.py (1)

54-60: Consider de-duplicating repeated GapDetector(min_duration=120) wiring.

This works, but repeating the same detector literal across all channels makes future duration changes error-prone.

♻️ Small DRY refactor
+GAP_MIN_DURATION = 120
+DEFAULT_STRUCT_DETECTORS = [TrendDetector(), SpikeDetector(), GapDetector(min_duration=GAP_MIN_DURATION)]
+
 MHC_CHANNEL_CONFIG = ChannelConfig(
@@
-        "hk_iphone:HKQuantityTypeIdentifierStepCount":              [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
-        "hk_iphone:HKQuantityTypeIdentifierDistanceWalkingRunning": [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
-        "hk_iphone:HKQuantityTypeIdentifierFlightsClimbed":         [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
-        "hk_watch:HKQuantityTypeIdentifierStepCount":               [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
-        "hk_watch:HKQuantityTypeIdentifierDistanceWalkingRunning":  [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
+        "hk_iphone:HKQuantityTypeIdentifierStepCount":              DEFAULT_STRUCT_DETECTORS,
+        "hk_iphone:HKQuantityTypeIdentifierDistanceWalkingRunning": DEFAULT_STRUCT_DETECTORS,
+        "hk_iphone:HKQuantityTypeIdentifierFlightsClimbed":         DEFAULT_STRUCT_DETECTORS,
+        "hk_watch:HKQuantityTypeIdentifierStepCount":               DEFAULT_STRUCT_DETECTORS,
+        "hk_watch:HKQuantityTypeIdentifierDistanceWalkingRunning":  DEFAULT_STRUCT_DETECTORS,
         "hk_watch:HKQuantityTypeIdentifierHeartRate":               [TrendDetector(filter_zeros=True), SpikeDetector(filter_zeros=True), GapDetector(min_duration=120)],
-        "hk_watch:HKQuantityTypeIdentifierActiveEnergyBurned":      [TrendDetector(), SpikeDetector(), GapDetector(min_duration=120)],
+        "hk_watch:HKQuantityTypeIdentifierActiveEnergyBurned":      DEFAULT_STRUCT_DETECTORS,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mhc/constants.py` around lines 54 - 60, The mapping repeats
GapDetector(min_duration=120) across multiple channels; define a single reusable
instance (e.g., DEFAULT_GAP_DETECTOR = GapDetector(min_duration=120)) near the
top of mhc/constants.py and replace each literal GapDetector(min_duration=120)
in the channel lists (the mappings containing
"hk_iphone:HKQuantityTypeIdentifierStepCount",
"hk_watch:HKQuantityTypeIdentifierHeartRate", etc.) with that constant name to
avoid duplication and make future duration changes a single edit. Ensure the
constant is created before the mapping and used in all relevant lists.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@detectors/gap.py`:
- Around line 17-19: The constructor currently accepts any min_duration; add a
validation in the __init__ of the detector class (the method shown) to ensure
min_duration is a positive integer (>0) and raise ValueError with a clear
message if not; keep the existing super().__init__(filter_zeros=False) and
assignment self.min_duration = min_duration but perform the check before
assigning to self.min_duration (and optionally cast/validate type if needed).

In `@templates/templates.json`:
- Around line 31-36: Templates under the "gap" key currently assume gaps are
zeros, but GapDetector (detectors/gap.py, GapDetector) treats both NaN and 0 as
gaps; update the "gap" templates in templates/templates.json (the "gap" array)
to use neutral wording (e.g., "missing or zero", "no reading", or separate
variants for NaN-only vs zero-only) so captions correctly reflect NaN runs as
non-zero-mislabelled gaps; reference the GapDetector behavior when choosing
neutral phrasing or add distinct template sets (e.g., "gap_nan" and "gap_zero")
and ensure any template-selection logic later can pick the correct template type
if you add separate keys.

---

Nitpick comments:
In `@detectors/gap.py`:
- Around line 36-37: The _detect method currently returns an empty list which
can silently hide incorrect call paths; replace the direct return with raising
NotImplementedError (e.g., in the _detect(series: np.ndarray, indices:
np.ndarray) -> list[DetectionResult] method) and include a short message
indicating this detector bypasses _detect so callers get an explicit error if it
is ever invoked.
- Line 33: The zip over filtered start/end arrays (the for s, e in
zip(starts[mask], ends[mask]) call in detectors/gap.py) should enforce that the
two sequences have equal length; update the call to zip(starts[mask],
ends[mask], strict=True) to fail fast if they diverge, and if your project must
support Python <3.10 instead of using strict=True add an explicit check (e.g.,
assert len(starts[mask]) == len(ends[mask]) or raise a ValueError) before the
zip to preserve the invariant.

In `@mhc/constants.py`:
- Around line 54-60: The mapping repeats GapDetector(min_duration=120) across
multiple channels; define a single reusable instance (e.g., DEFAULT_GAP_DETECTOR
= GapDetector(min_duration=120)) near the top of mhc/constants.py and replace
each literal GapDetector(min_duration=120) in the channel lists (the mappings
containing "hk_iphone:HKQuantityTypeIdentifierStepCount",
"hk_watch:HKQuantityTypeIdentifierHeartRate", etc.) with that constant name to
avoid duplication and make future duration changes a single edit. Ensure the
constant is created before the mapping and used in all relevant lists.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8a095e90-5f33-4532-bae8-6d7e339001db

📥 Commits

Reviewing files that changed from the base of the PR and between a1b684e and 6d629cc.

📒 Files selected for processing (4)
  • detectors/__init__.py
  • detectors/gap.py
  • mhc/constants.py
  • templates/templates.json

Comment thread detectors/gap.py Outdated
Comment on lines +17 to +19
def __init__(self, min_duration: int = 30) -> None:
super().__init__(filter_zeros=False)
self.min_duration = min_duration
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate min_duration is positive.

A non-positive value can admit unintended detections; guard it at construction.

🛡️ Proposed guard
     def __init__(self, min_duration: int = 30) -> None:
         super().__init__(filter_zeros=False)
+        if min_duration <= 0:
+            raise ValueError("min_duration must be > 0")
         self.min_duration = min_duration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/gap.py` around lines 17 - 19, The constructor currently accepts any
min_duration; add a validation in the __init__ of the detector class (the method
shown) to ensure min_duration is a positive integer (>0) and raise ValueError
with a clear message if not; keep the existing
super().__init__(filter_zeros=False) and assignment self.min_duration =
min_duration but perform the check before assigning to self.min_duration (and
optionally cast/validate type if needed).

Comment thread templates/templates.json Outdated
Comment on lines +31 to +36
"gap": [
"{name} reads zero from minute {start} to {end}.",
"{name} recorded constant zero values between minute {start} and {end}.",
"A flat-zero reading in {name} spans minute {start} through {end}.",
"Between minute {start} and {end}, {name} shows a sustained zero reading.",
"{name} data remains at zero from minute {start} to {end}."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Gap templates mislabel NaN runs as zero readings.

GapDetector treats both NaN and 0 as gaps (detectors/gap.py, Line 22), but these templates always say “zero.” NaN-only runs will produce incorrect captions.

💡 Proposed template wording update
-      "{name} reads zero from minute {start} to {end}.",
-      "{name} recorded constant zero values between minute {start} and {end}.",
-      "A flat-zero reading in {name} spans minute {start} through {end}.",
-      "Between minute {start} and {end}, {name} shows a sustained zero reading.",
-      "{name} data remains at zero from minute {start} to {end}."
+      "{name} has missing or zero readings from minute {start} to {end}.",
+      "{name} recorded a sustained missing/zero interval between minute {start} and {end}.",
+      "A contiguous missing-or-zero run in {name} spans minute {start} through {end}.",
+      "Between minute {start} and {end}, {name} shows a sustained missing/zero period.",
+      "{name} data remains missing or zero from minute {start} to {end}."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@templates/templates.json` around lines 31 - 36, Templates under the "gap" key
currently assume gaps are zeros, but GapDetector (detectors/gap.py, GapDetector)
treats both NaN and 0 as gaps; update the "gap" templates in
templates/templates.json (the "gap" array) to use neutral wording (e.g.,
"missing or zero", "no reading", or separate variants for NaN-only vs zero-only)
so captions correctly reflect NaN runs as non-zero-mislabelled gaps; reference
the GapDetector behavior when choosing neutral phrasing or add distinct template
sets (e.g., "gap_nan" and "gap_zero") and ensure any template-selection logic
later can pick the correct template type if you add separate keys.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
extractors/device_wear.py (2)

122-126: Add strict=True to zip() for defensive safety.

Per static analysis hint, zip() without strict= can silently truncate if starts and ends have different lengths. While they should always match given the diff-based derivation, adding strict=True catches any future edge cases.

🔧 Proposed fix
     windows = [
         (int(start), int(end))
-        for start, end in zip(starts.tolist(), ends.tolist())
+        for start, end in zip(starts.tolist(), ends.tolist(), strict=True)
         if end - start + 1 >= min_nonwear_duration
     ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extractors/device_wear.py` around lines 122 - 126, Update the list
comprehension that builds windows to call zip with strict=True to ensure starts
and ends lengths must match; specifically change the comprehension using windows
= [ (int(start), int(end)) for start, end in zip(starts.tolist(), ends.tolist())
if end - start + 1 >= min_nonwear_duration ] to use zip(starts.tolist(),
ends.tolist(), strict=True). Ensure this modification is applied where the
variables starts, ends, min_nonwear_duration and the windows assignment occur.

102-113: Consider the restrictive watch_device overlap logic.

For watch_device, the overlap requires both heart rate AND active energy channels to have gaps at the same minute (line 111). If either channel is missing from the recording or lacks a GapDetector, no windows are returned (lines 105-110).

This may be intentionally conservative, but it could miss valid non-wear periods when only one of these channels is available. Consider whether a fallback to the general coverage-based logic would be appropriate when only one key channel is present.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extractors/device_wear.py` around lines 102 - 113, The current watch_device
branch requires both heart rate and active energy masks to exist and uses their
intersection, returning [] if either channel or mask is missing; change this so
that if both hr and active_energy masks exist you keep overlap_mask = hr_mask &
active_energy_mask, but if one or both channels/masks are missing you fall back
to the general coverage rule (overlap_mask = coverage >=
min_overlapping_channels) instead of returning an empty list; update the logic
in the block handling group_name == "watch_device" around
_heart_rate_channel_index, _active_energy_channel_index, and channel_masks
lookups to implement this fallback.
tests/test_device_wear_extractor.py (2)

83-101: test_steps_do_not_substitute_for_active_energy currently duplicates the previous scenario.

Right now, Line 84-89 and Line 94-99 define the same input, so both tests assert the same behavior. Consider making test_watch_requires_active_energy_gap use no steps gap, and keeping the steps-gap case in test_steps_do_not_substitute_for_active_energy.

♻️ Suggested test-data split to avoid duplication
 def test_watch_requires_active_energy_gap(self) -> None:
     values = np.array([
         [1, 0, 0, 0, 1, 1, 1, 1],
         [1, 1, 1, 1, 1, 1, 1, 1],
-        [1, 1, 0, 0, 0, 1, 1, 1],
+        [1, 1, 1, 1, 1, 1, 1, 1],
         [1, 1, 1, 1, 1, 1, 1, 1],
     ])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_device_wear_extractor.py` around lines 83 - 101, The two tests
test_watch_requires_active_energy_gap and
test_steps_do_not_substitute_for_active_energy currently use identical input
arrays; change the input for test_watch_requires_active_energy_gap to represent
the "no steps gap" scenario and leave
test_steps_do_not_substitute_for_active_energy as the "steps-gap" scenario so
they exercise different logic. Concretely, modify the values passed to
_recording(self.names) in test_watch_requires_active_energy_gap to remove/alter
the steps-gap pattern (e.g., make the middle row or column show continuous
activity without the step-gap pattern), while keeping the current values in
test_steps_do_not_substitute_for_active_energy; this ensures extractor.extract
and assertions remain the same but the tests validate distinct cases.

49-124: Add one NaN-based watch-gap test to cover the stated detector behavior.

All current cases use zeros only. Adding a NaN variant would lock in the zero/NaN requirement end-to-end for DeviceWearExtractor.

🧪 Suggested additional test
+    def test_hr_and_active_energy_nan_gaps_produce_annotation(self) -> None:
+        values = np.array([
+            [1, np.nan, np.nan, 0, 1, 1, 1, 1],
+            [1, 1, np.nan, np.nan, 0, 1, 1, 1],
+            [1, 1, 1, 1, 1, 1, 1, 1],
+            [1, 1, 1, 1, 1, 1, 1, 1],
+        ])
+        annotations = self.extractor.extract(_recording(values, self.names))
+        self.assertEqual(len(annotations), 1)
+        self.assertEqual(annotations[0].window, (2, 3))
+        self.assertEqual(annotations[0].caption_type, "watch_wear")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_device_wear_extractor.py` around lines 49 - 124, Add a new unit
test that mirrors an existing zero-based watch-gap case but replaces the gap (0)
entries with numpy.nan to assert DeviceWearExtractor treats NaN as a gap;
specifically, create a test (e.g., test_watch_gap_with_nan) that builds the same
ChannelConfig/GapDetector setup used in the other tests, feeds a recording whose
heart-rate or active-energy channel contains NaNs where zeros previously were,
calls DeviceWearExtractor.extract, and asserts the expected annotations (or
empty list) so the extractor.extract and GapDetector(min_duration=...) behavior
for NaN gaps is covered end-to-end.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@extractors/device_wear.py`:
- Around line 122-126: Update the list comprehension that builds windows to call
zip with strict=True to ensure starts and ends lengths must match; specifically
change the comprehension using windows = [ (int(start), int(end)) for start, end
in zip(starts.tolist(), ends.tolist()) if end - start + 1 >=
min_nonwear_duration ] to use zip(starts.tolist(), ends.tolist(), strict=True).
Ensure this modification is applied where the variables starts, ends,
min_nonwear_duration and the windows assignment occur.
- Around line 102-113: The current watch_device branch requires both heart rate
and active energy masks to exist and uses their intersection, returning [] if
either channel or mask is missing; change this so that if both hr and
active_energy masks exist you keep overlap_mask = hr_mask & active_energy_mask,
but if one or both channels/masks are missing you fall back to the general
coverage rule (overlap_mask = coverage >= min_overlapping_channels) instead of
returning an empty list; update the logic in the block handling group_name ==
"watch_device" around _heart_rate_channel_index, _active_energy_channel_index,
and channel_masks lookups to implement this fallback.

In `@tests/test_device_wear_extractor.py`:
- Around line 83-101: The two tests test_watch_requires_active_energy_gap and
test_steps_do_not_substitute_for_active_energy currently use identical input
arrays; change the input for test_watch_requires_active_energy_gap to represent
the "no steps gap" scenario and leave
test_steps_do_not_substitute_for_active_energy as the "steps-gap" scenario so
they exercise different logic. Concretely, modify the values passed to
_recording(self.names) in test_watch_requires_active_energy_gap to remove/alter
the steps-gap pattern (e.g., make the middle row or column show continuous
activity without the step-gap pattern), while keeping the current values in
test_steps_do_not_substitute_for_active_energy; this ensures extractor.extract
and assertions remain the same but the tests validate distinct cases.
- Around line 49-124: Add a new unit test that mirrors an existing zero-based
watch-gap case but replaces the gap (0) entries with numpy.nan to assert
DeviceWearExtractor treats NaN as a gap; specifically, create a test (e.g.,
test_watch_gap_with_nan) that builds the same ChannelConfig/GapDetector setup
used in the other tests, feeds a recording whose heart-rate or active-energy
channel contains NaNs where zeros previously were, calls
DeviceWearExtractor.extract, and asserts the expected annotations (or empty
list) so the extractor.extract and GapDetector(min_duration=...) behavior for
NaN gaps is covered end-to-end.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 751cf8e1-058b-4681-b718-3e010fbe0d70

📥 Commits

Reviewing files that changed from the base of the PR and between 6d629cc and 4dbdb48.

📒 Files selected for processing (13)
  • .gitignore
  • README.md
  • captionizer.py
  • detectors/__init__.py
  • explorer.py
  • extractors/__init__.py
  • extractors/device_wear.py
  • extractors/structural.py
  • mhc/constants.py
  • mhc_weekly/constants.py
  • templates/templates.json
  • tests/test_device_wear_extractor.py
  • visualizer.py
💤 Files with no reviewable changes (1)
  • extractors/structural.py
✅ Files skipped from review due to trivial changes (3)
  • .gitignore
  • README.md
  • templates/templates.json

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
extractors/device_wear.py (1)

18-26: Validate non-wear thresholds up front.

Non-positive min_overlapping_channels or min_nonwear_duration currently turns a config mistake into overly permissive detections. Failing fast here will make bad settings obvious.

Suggested change
     def __init__(
         self,
         config: ChannelConfig,
         min_overlapping_channels: int = 2,
         min_nonwear_duration: int = 30,
     ):
         super().__init__(config)
+        if min_overlapping_channels <= 0:
+            raise ValueError("min_overlapping_channels must be positive")
+        if min_nonwear_duration <= 0:
+            raise ValueError("min_nonwear_duration must be positive")
         self.min_overlapping_channels = min_overlapping_channels
         self.min_nonwear_duration = min_nonwear_duration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extractors/device_wear.py` around lines 18 - 26, The constructor (__init__)
accepts min_overlapping_channels and min_nonwear_duration but does not validate
them; add upfront validation in the __init__ of the device wear extractor to
ensure both min_overlapping_channels and min_nonwear_duration are positive
integers (>0) and raise a ValueError with a clear message if not (also consider
TypeError if non-integers), so invalid configs fail fast instead of producing
permissive detections.
tests/test_device_wear_extractor.py (1)

49-124: Add one boundary-window test for the padded diff logic.

Current coverage locks in middle-of-series runs only. A leading/trailing overlap case would protect _mask_to_windows() against off-by-one regressions at both edges.

Suggested test
 class DeviceWearExtractorTest(unittest.TestCase):
@@
     def test_missing_watch_group_returns_no_annotations(self) -> None:
         config = ChannelConfig(
             names=list(self.names),
             meta={name: (name, "", 1) for name in self.names},
             continuous=frozenset(self.names),
             groups={},
         )
         extractor = DeviceWearExtractor(config, min_nonwear_duration=2)
         values = np.array([
             [1, 1, 1, 1],
             [1, 1, 1, 1],
             [70, np.nan, np.nan, 73],
             [5, 0, 0, 8],
             [1, 1, 1, 1],
         ])
         annotations = extractor.extract(_recording(values, self.names))
         self.assertEqual(annotations, [])
+
+    def test_nonwear_at_series_boundaries_is_preserved(self) -> None:
+        values = np.array([
+            [1, 1, 1, 1, 1],
+            [1, 1, 1, 1, 1],
+            [np.nan, np.nan, 70, np.nan, np.nan],
+            [0, 0, 5, 0, 0],
+            [1, 1, 1, 1, 1],
+        ])
+        annotations = self.extractor.extract(_recording(values, self.names))
+        self.assertEqual([annotation.window for annotation in annotations], [(0, 1), (3, 4)])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_device_wear_extractor.py` around lines 49 - 124, Add a unit test
in tests/test_device_wear_extractor.py that targets the edge-case behavior of
the padded-diff windowing used by _mask_to_windows(): create a
DeviceWearExtractor (or reuse self.extractor) with min_nonwear_duration set so
short runs are considered, craft a recording where HR/active-energy overlap
occurs at the very start (leading) and another where it occurs at the very end
(trailing) of the series, call extractor.extract(_recording(..., self.names))
and assert that each produces exactly the expected annotation window (boundary
indices) and channel_idxs so off-by-one errors at the series edges would fail
the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@extractors/device_wear.py`:
- Around line 18-26: The constructor (__init__) accepts min_overlapping_channels
and min_nonwear_duration but does not validate them; add upfront validation in
the __init__ of the device wear extractor to ensure both
min_overlapping_channels and min_nonwear_duration are positive integers (>0) and
raise a ValueError with a clear message if not (also consider TypeError if
non-integers), so invalid configs fail fast instead of producing permissive
detections.

In `@tests/test_device_wear_extractor.py`:
- Around line 49-124: Add a unit test in tests/test_device_wear_extractor.py
that targets the edge-case behavior of the padded-diff windowing used by
_mask_to_windows(): create a DeviceWearExtractor (or reuse self.extractor) with
min_nonwear_duration set so short runs are considered, craft a recording where
HR/active-energy overlap occurs at the very start (leading) and another where it
occurs at the very end (trailing) of the series, call
extractor.extract(_recording(..., self.names)) and assert that each produces
exactly the expected annotation window (boundary indices) and channel_idxs so
off-by-one errors at the series edges would fail the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 91cda561-d2d4-4500-8b1a-995e155cec66

📥 Commits

Reviewing files that changed from the base of the PR and between 4dbdb48 and c59adfc.

📒 Files selected for processing (6)
  • README.md
  • explorer.py
  • extractors/device_wear.py
  • mhc/constants.py
  • mhc_weekly/constants.py
  • tests/test_device_wear_extractor.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • README.md
  • mhc/constants.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
explorer.py (1)

619-621: Minor inconsistency in display labels for watch wear.

The display labels for watch wear are inconsistent across the UI:

  • Line 59 (_hit_target_label): "watch non-wear" (space-separated)
  • Line 536 (legend): "WatchNonWear" (CamelCase)
  • Line 619 (captions): "watch_nonwear" (underscore)

Consider unifying these for a consistent user experience.

♻️ Suggested unified naming
-                    label="WatchNonWear",
+                    label="watch non-wear",
-            display_type = "watch_nonwear" if caption_type == "watch_wear" else caption_type
+            display_type = "watch non-wear" if caption_type == "watch_wear" else caption_type
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@explorer.py` around lines 619 - 621, Summary: Inconsistent labeling for watch
wear/non-wear across _hit_target_label, the legend, and the caption generation
using caption_type/display_type. Fix: pick one canonical label (e.g., "watch
non-wear") and update all places to use it; specifically change how
caption_type/display_type are mapped in the loop that builds caption_lines (the
display_type assignment and its use in caption_lines.append) to produce the
canonical string, and also update the definitions/uses in _hit_target_label and
the legend generation to the same exact token so all UI locations match.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@explorer.py`:
- Around line 619-621: Summary: Inconsistent labeling for watch wear/non-wear
across _hit_target_label, the legend, and the caption generation using
caption_type/display_type. Fix: pick one canonical label (e.g., "watch
non-wear") and update all places to use it; specifically change how
caption_type/display_type are mapped in the loop that builds caption_lines (the
display_type assignment and its use in caption_lines.append) to produce the
canonical string, and also update the definitions/uses in _hit_target_label and
the legend generation to the same exact token so all UI locations match.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 424afb32-d5ac-4629-995f-5995f17df9b6

📥 Commits

Reviewing files that changed from the base of the PR and between c59adfc and 1d31aa5.

📒 Files selected for processing (1)
  • explorer.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants