Cross-Channel Extractors#49

Open
KarlDeck wants to merge 27 commits into main from KarlDeck/Sleep-Bundles

@KarlDeck (Collaborator) commented Apr 9, 2026

Cross-channel extractor and sleep synthesizer

♻️ Current situation & Problem

This PR introduces an initial raw version of sleep-related bundle captions built on top of the current signal representation and annotation pipeline.

The goal is to capture higher-level multi-channel context, like:

  • in bed but not asleep
  • more coherent sleep-related summaries across relevant channels

This is intended as an exploratory first version; more cross-channel captions are to follow this example. In reviewing these specific bundle outputs, one thing that comes up is that the detector may sometimes fire during the day when someone briefly lies down. In those cases, the label is still technically correct because the person is in bed but not asleep. That said, this behavior could be filtered later by requiring a minimum bed-stay duration before such a caption qualifies. For this PR, the intention is to keep the behavior relatively raw rather than aggressively filtered.

⚙️ Release Notes

  • Add initial raw sleep-bundle caption support
  • Add support for identifying and captioning in bed but awake style intervals
  • Establish a baseline implementation that can later be refined with stricter duration thresholds and filtering rules

Potential follow-up refinement:

  • Require a minimum in bed duration before generating a sleep-related bundle caption to reduce short daytime lay-down detections

📚 Documentation

This PR is meant as a first-pass implementation of sleep bundles and focuses on getting the raw behavior into the pipeline so we can inspect outputs and iterate.

A notable design choice is that short daytime bed intervals are currently allowed to surface as captions when the underlying label is still semantically correct. We may later tighten this by introducing duration-based gating, but for now this version is intentionally permissive so we can better understand the behavior of the raw detector and captioning logic.
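To illustrate the duration-based gating mentioned above, here is a minimal sketch of a run-length helper. The name `contiguous_windows(mask, min_duration)` matches the utility listed in the walkthrough, but the body below is an assumption rather than the code in this PR, and the sample arrays are made up.

```python
import numpy as np


def contiguous_windows(mask, min_duration=0):
    """Return (start, end) pairs, end exclusive, for each run of True
    values in `mask` that lasts at least `min_duration` samples."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size == 0:
        return []
    # Pad with False so runs touching either edge still produce two edges.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_duration]


# Hypothetical per-sample channels: 1 means "in bed" / "asleep".
in_bed = np.array([0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0])
asleep = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0])
awake_in_bed = (in_bed > 0) & ~(asleep > 0)

raw_windows = contiguous_windows(awake_in_bed)        # permissive, as in this PR
gated_windows = contiguous_windows(awake_in_bed, 2)   # drops the 1-sample run
```

With `min_duration=0` every run surfaces, which corresponds to the raw behavior described here; raising the threshold is the follow-up refinement.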

✅ Testing

Testing for this PR focused primarily on implementation and output inspection of the raw caption behavior.

Important note:

  • Some generated captions may correspond to short daytime bed stays
  • This is currently expected behavior in the raw version
  • If needed, this can be refined in a follow-up by adding a minimum duration threshold for bed intervals before caption generation

Code of Conduct & Contributing Guidelines

By creating and submitting this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

@KarlDeck KarlDeck requested a review from max-rosenblattl April 9, 2026 18:09
coderabbitai bot commented Apr 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a CrossChannelExtractor and three CrossChannelSynthesizers (Sleep, StationaryActivity, Cardio); extends templates and VALID_CAPTION_TYPES; wires the extractor into captionizer/visualizer/explorer; reworks explorer target navigation/rendering for cross-channel windows; updates spike/trend detectors and detector configs; adds data/ to .gitignore.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Ignore rules | .gitignore | Added data/ and .DS_Store to ignored paths. |
| Captioning pipeline wiring | captionizer.py, visualizer.py | Integrated CrossChannelExtractor(MHC_CHANNEL_CONFIG, synthesizers=[SleepSynthesizer(), StationaryActivitySynthesizer(), CardioSynthesizer()]) into Annotator bootstrap. |
| Extractors core | extractors/__init__.py, extractors/cross_channel.py | Added "cross_channel" to VALID_CAPTION_TYPES; new CrossChannelExtractor aggregates annotations from provided synthesizers. |
| Synthesizer infra & utils | synthesizers/__init__.py | Added abstract CrossChannelSynthesizer interface and contiguous_windows(mask, min_duration) utility. |
| Sleep synthesizer | synthesizers/sleep.py | New SleepSynthesizer detecting in-bed-but-awake windows from sleep:inbed/sleep:asleep channels and emitting cross_channel annotations. |
| Stationary activity synthesizer | synthesizers/stationary_activity.py | New StationaryActivitySynthesizer detecting HIIT/strength/functional stationary workout windows, computing optional metrics, and emitting cross_channel annotations. |
| Cardio synthesizer | synthesizers/cardio.py | New CardioSynthesizer detecting running/cycling workout windows, computing HR/distance/steps metrics and emitting cross_channel annotations. |
| Templates | templates/templates.json | Removed structural.drop templates; added top-level cross_channel templates for sleep, stationary_activity (hiit/strength/functional), and cardio (running/cycling). |
| Explorer UI & rendering | explorer.py | Reworked hit-target model to support CROSS_CHANNEL_TARGETS, replaced per-detector buttons with cycling controls, added _row_hit_signal_indices, _cross_channel_windows, updated _jump_to_hit() and render() to draw axvspan overlays for cross-channel windows; adjusted hit navigation/labels and spike labeling. |
| Detectors adjustments | detectors/__init__.py, detectors/spike.py, detectors/trend.py | Removed "drop" from DetectionResult.event_type; simplified SpikeDetector to single-pass peak detection via find_peaks (emit only spike), removed many old helpers; TrendDetector post-merge now resolves containment of opposite-direction segments. |
| Structural extractor | extractors/structural.py | Removed MAX_EVENTS_PER_SIGNAL truncation; now preserves all detected events per signal. |
| MHC configs | mhc/constants.py, mhc_weekly/constants.py | Removed spike-detector factory helpers; updated per-channel SpikeDetector constructor args to new min_height/min_distance/top_k/min_prominence patterns. |
| Docs | README.md | Updated Explorer startup example to include --min-wear-pct=50.0. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Explorer
    participant Annotator
    participant CCExtractor as CrossChannelExtractor
    participant SleepSynth as SleepSynthesizer
    participant StationarySynth as StationaryActivitySynthesizer
    participant CardioSynth as CardioSynthesizer
    participant Templates

    User->>Explorer: open recording / request annotations
    Explorer->>Annotator: extract(recording)
    Annotator->>CCExtractor: extract(recording)
    CCExtractor->>SleepSynth: synthesize(recording, config)
    CCExtractor->>StationarySynth: synthesize(recording, config)
    CCExtractor->>CardioSynth: synthesize(recording, config)
    SleepSynth->>Templates: load cross_channel.sleep templates
    StationarySynth->>Templates: load cross_channel.stationary_activity templates
    CardioSynth->>Templates: load cross_channel.cardio templates
    Templates-->>SleepSynth: template strings
    Templates-->>StationarySynth: template strings
    Templates-->>CardioSynth: template strings
    SleepSynth-->>CCExtractor: annotations (sleep windows)
    StationarySynth-->>CCExtractor: annotations (stationary activity windows)
    CardioSynth-->>CCExtractor: annotations (cardio windows)
    CCExtractor-->>Annotator: aggregated annotations
    Annotator-->>Explorer: annotated recording
    Explorer->>Explorer: render() draws axvspan overlays for cross-channel windows

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • max-rosenblattl
  • RealLast
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Cross-Channel Extractors' directly and concisely describes the main addition: a new cross-channel extractor framework with sleep synthesizer support. |
| Description check | ✅ Passed | The description is well-related to the changeset, explaining the problem (capturing multi-channel context like 'in bed but not asleep'), the solution (sleep synthesizer), and design choices. |

@max-rosenblattl changed the title from "In Bed but not Asleep Bundle" to "Cross-channel extractor and sleep synthesizer" Apr 12, 2026
@max-rosenblattl force-pushed the KarlDeck/Sleep-Bundles branch from 8a21172 to e457736 April 12, 2026 06:24
@max-rosenblattl marked this pull request as ready for review April 12, 2026 06:29
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
.gitignore (1)

6-6: Consider adding a newline at end of file.

The file currently ends without a newline character. Adding one aligns with POSIX standards and prevents diff noise in future changes.

📝 Proposed fix
-data/
+data/
+
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore at line 6, Add a trailing newline at the end of the .gitignore by
ensuring the final "data/" entry is followed by a newline character (i.e., end
the file with a blank line) so the file conforms to POSIX conventions and avoids
future diff noise.
synthesizers/sleep.py (1)

35-37: Avoid reading and decoding templates.json for every row.

SleepSynthesizer.synthesize() runs on the per-row annotation path in captionizer and explorer. Re-reading the same template file here adds unnecessary I/O and JSON parsing overhead; load it once in __init__ or memoize it by config.templates_path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@synthesizers/sleep.py` around lines 35 - 37, The synthesize method currently
re-reads and parses the templates JSON for every row (templates =
json.loads(config.templates_path.read_text())...), causing heavy I/O; move this
load into SleepSynthesizer.__init__ (e.g., self.templates =
json.loads(config.templates_path.read_text())["cross_channel"]["sleep"]) or
implement a simple memoization keyed by config.templates_path so synthesize uses
self.templates instead of re-reading; update references in
SleepSynthesizer.synthesize() (and any callers) to use the cached templates
while keeping the rest of the logic (including time_unit and
CaptionExtractor._seed usage) unchanged.
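The memoization option could look roughly like the sketch below. The `["cross_channel"]["sleep"]` lookup comes from the comment above; the function name `load_sleep_templates` and the assumption that the entry is a list of template strings are hypothetical.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_sleep_templates(templates_path: Path) -> tuple:
    """Read and parse the templates file once per distinct path.

    Returns a tuple so the cached value cannot be mutated by callers.
    """
    data = json.loads(Path(templates_path).read_text())
    return tuple(data["cross_channel"]["sleep"])
```

A per-row `synthesize()` call would then use `load_sleep_templates(config.templates_path)` instead of re-reading the file. One trade-off: edits to the file during a running session are not picked up until `load_sleep_templates.cache_clear()` is called.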
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@synthesizers/sleep.py`:
- Around line 19-20: The constructor for SleepSynthesizer currently defaults
min_duration to 5 which unintentionally filters short (<5-sample) intervals;
change the default in SleepSynthesizer.__init__ to a permissive value (e.g., 0)
so SleepSynthesizer() yields raw/permissive output and does not drop
sub-5-sample “in bed but awake” windows, and keep the min_duration parameter
available for callers that want to opt into filtering.
- Around line 31-33: The current guard requires asleep to have a positive value
before emitting awake-in-bed intervals; change it to only require that asleep
has any observed (non-NaN) samples. Replace the condition "if not
np.any((~np.isnan(asleep)) & (asleep > 0)):" with a check that tests for any
non-NaN in asleep (e.g., "if not np.any(~np.isnan(asleep)):"), leaving the
subsequent mask expression using asleep > 0 and in_bed unchanged so awake-in-bed
windows are emitted when asleep is observed but stays zero.
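A small sketch of the relaxed guard from the second sleep.py comment, assuming `in_bed` and `asleep` are per-sample float arrays with NaN for missing data. The function name and the exact mask expression are assumptions; only the two guard conditions are quoted from the review.

```python
import numpy as np


def awake_in_bed_mask(in_bed, asleep):
    """Boolean mask of samples where the person is in bed but not asleep.

    Returns an all-False mask when `asleep` has no observed samples,
    since "not asleep" cannot be inferred from missing data.
    """
    in_bed = np.asarray(in_bed, dtype=float)
    asleep = np.asarray(asleep, dtype=float)
    # Relaxed guard: any non-NaN sample counts as "observed", even if it is 0.
    if not np.any(~np.isnan(asleep)):
        return np.zeros(in_bed.shape, dtype=bool)
    return (in_bed > 0) & ~(asleep > 0)
```

Under the stricter guard (`asleep > 0` somewhere), a night with `asleep` observed but constantly zero would emit nothing; with this version it yields an awake-in-bed window.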

In `@visualizer.py`:
- Around line 125-129: The plot_row() function is only rendering captions for
annotation types 'statistical', 'structural', and 'semantic', so annotations
produced by CrossChannelExtractor (created in the Annotator instantiation
alongside StatisticalExtractor and StructuralExtractor) are being dropped;
update plot_row() to recognize and render the 'cross_channel' annotation type
(or whatever key CrossChannelExtractor uses) in the same way as the others—add
the caption/layout branch or include 'cross_channel' in the type-handling list
and ensure its text and markers are drawn to the saved plot so sleep-bundle
annotations become visible.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5a2f7a21-e5a2-422f-8152-55d48ebb151f

📥 Commits

Reviewing files that changed from the base of the PR and between b268ec4 and e457736.

📒 Files selected for processing (9)
  • .gitignore
  • captionizer.py
  • explorer.py
  • extractors/__init__.py
  • extractors/cross_channel.py
  • synthesizers/__init__.py
  • synthesizers/sleep.py
  • templates/templates.json
  • visualizer.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@explorer.py`:
- Around line 414-452: The scan loop never revisits the starting row after a
full cycle because row_offset only iterates 0..n_rows-1 and the special-case
"elif row_offset == 0: continue" prevents wrapping back to the current row; fix
by changing the loop to iterate one extra step (use range(1, n_rows+1) or
otherwise include the wrap iteration) and remove the early-skip so that when
row_index == self.row_index and there are no candidates after the current signal
you still allow selecting hit_signals[0] (for step>0) or hit_signals[-1] (for
step<0) on the wrapped iteration; update the loop that uses start_row,
row_offset, and row_index and keep references to self._row_hit_signal_indices,
self.row_index, self.signal_index, and self.hit_target intact so the code will
correctly jump back to hits on the original row after scanning the rest of the
dataset.
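The wrap-around fix can be sketched as a standalone function. This is not the `_jump_to_hit()` code from explorer.py: `next_hit` and `hits_by_row` (a list of sorted signal indices per row) are hypothetical stand-ins for the state the comment refers to.

```python
def next_hit(hits_by_row, row, signal, step=1):
    """Find the next (row, signal) hit in direction `step` (+1 or -1).

    Iterates n_rows + 1 times so the final iteration wraps back onto the
    starting row and can select hits before (or after) the current signal.
    """
    n_rows = len(hits_by_row)
    for offset in range(n_rows + 1):
        r = (row + step * offset) % n_rows
        hits = hits_by_row[r]
        if offset == 0:
            # On the starting row, only consider hits beyond the current signal.
            if step > 0:
                candidates = [s for s in hits if s > signal]
            else:
                candidates = [s for s in hits if s < signal]
        else:
            candidates = hits
        if candidates:
            return r, (candidates[0] if step > 0 else candidates[-1])
    return None
```

For example, with hits only on row 0 at signals 2 and 5, stepping forward from (0, 5) scans the other rows and then lands back on (0, 2) instead of getting stuck.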

In `@synthesizers/cardio.py`:
- Around line 19-21: The distance channel is hard-coded to DISTANCE_CHANNEL =
"hk_watch:HKQuantityTypeIdentifierDistanceWalkingRunning" and that same
distance_idx is reused for cycling, so cycling workouts miss distance data; add
a new constant (e.g., CYCLING_DISTANCE_CHANNEL =
"hk_watch:HKQuantityTypeIdentifierDistanceCycling"), update the lookup logic
where distance_idx is computed and used (references: DISTANCE_CHANNEL and
distance_idx) to choose the cycling channel when the workout type is cycling
(e.g., check the workout type or a workout.kind value) and fall back to the
walking/running channel otherwise, ensuring downstream code that reads distance
uses the correct index for cycling workouts.
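A hedged sketch of the channel selection described above. The two channel constants are taken from the comment; `distance_index`, the `channel_names` list, and the `"cycling"` kind string are assumptions about how the lookup might be wired.

```python
DISTANCE_CHANNEL = "hk_watch:HKQuantityTypeIdentifierDistanceWalkingRunning"
CYCLING_DISTANCE_CHANNEL = "hk_watch:HKQuantityTypeIdentifierDistanceCycling"


def distance_index(channel_names, workout_kind):
    """Return the index of the distance channel that matches the workout kind,
    or None when that channel is not present in the recording."""
    wanted = CYCLING_DISTANCE_CHANNEL if workout_kind == "cycling" else DISTANCE_CHANNEL
    try:
        return channel_names.index(wanted)
    except ValueError:
        return None
```

Returning `None` for a cycling workout without a cycling distance channel, rather than silently reading walking/running distance, keeps the caption from reporting a wrong value.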
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c9366eb0-3611-4bb7-84ba-94c5b0b09c94

📥 Commits

Reviewing files that changed from the base of the PR and between e4320e5 and 08af225.

📒 Files selected for processing (5)
  • captionizer.py
  • explorer.py
  • synthesizers/cardio.py
  • templates/templates.json
  • visualizer.py
✅ Files skipped from review due to trivial changes (2)
  • captionizer.py
  • templates/templates.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • visualizer.py

@KarlDeck changed the title from "Cross-channel extractor and sleep synthesizer" to "Cross-Channel Extractors" Apr 12, 2026

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
detectors/spike.py (1)

27-28: Remove unused parameters max_cluster_peaks and soft_top_k_margin.

These parameters are stored in __init__ but never referenced in _detect or _build_results. If they're placeholders for future functionality, consider adding a TODO comment; otherwise, remove them to avoid confusion.

♻️ Proposed fix to remove unused parameters
     def __init__(
         self,
         filter_zeros: bool = False,
         min_height: float | None = None,
         min_prominence: float = 0.0,
         min_threshold: float | None = None,
         min_distance: int = 1,
         min_width: int = 1,
         top_k: int | None = None,
-        max_cluster_peaks: int = 2,
-        soft_top_k_margin: float = 0.90,
     ) -> None:
         super().__init__(filter_zeros=filter_zeros)
         self.min_height = min_height
         self.min_prominence = min_prominence
         self.min_threshold = min_threshold
         self.min_distance = max(1, min_distance)
         self.min_width = max(1, min_width)
         self.top_k = top_k
-        self.max_cluster_peaks = max(1, max_cluster_peaks)
-        self.soft_top_k_margin = min(max(float(soft_top_k_margin), 0.0), 1.0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/spike.py` around lines 27 - 28, The constructor of the spike
detector currently accepts and stores max_cluster_peaks and soft_top_k_margin
but they are never used; remove these parameters from the __init__ signature and
delete the corresponding attribute assignments (search for max_cluster_peaks and
soft_top_k_margin in the class, e.g., in __init__ of the SpikeDetector) and any
related comments or default values; if these were meant as future hooks, instead
leave a single TODO comment in __init__ mentioning planned clustering/soft-top-k
behaviour and remove the unused attributes to avoid confusion.
explorer.py (1)

403-419: lru_cache on instance method may cause memory retention.

The @lru_cache(maxsize=256) decorator on _row_hit_signal_indices uses self as part of the cache key. While the bounded maxsize limits growth, the cache will hold references to SensorExplorer instances even after they should be garbage collected.

For this interactive single-instance application, the impact is minimal. If multi-instance usage becomes a concern, consider using functools.cached_property patterns or a per-instance cache dict.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@explorer.py` around lines 403 - 419, The method _row_hit_signal_indices is
decorated with `@lru_cache` which captures self in the cache key and can retain
SensorExplorer instances; remove the `@lru_cache` decorator and implement a
per-instance bounded cache instead (e.g. add self._row_hit_signal_indices_cache
= OrderedDict() or a plain dict with manual eviction in the class init), then
modify _row_hit_signal_indices to check the per-instance cache using (row_index,
target_name) as the key, compute the result using the existing logic that calls
_load_row_bundle, _cross_channel_windows, and _row_detector_events when missing,
store the tuple in the per-instance cache (evict oldest when exceeding maxsize
256), and optionally clear the cache on instance cleanup.
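A per-instance bounded cache along these lines might look like the sketch below. `RowHitCache` and its `compute` callback are hypothetical; in explorer.py the callback would be the existing logic that builds the hit indices for a `(row_index, target_name)` pair.

```python
from collections import OrderedDict


class RowHitCache:
    """Small LRU cache owned by one instance, keyed by (row_index, target_name)."""

    def __init__(self, compute, maxsize=256):
        self._compute = compute
        self._maxsize = maxsize
        self._entries = OrderedDict()

    def get(self, row_index, target_name):
        key = (row_index, target_name)
        if key in self._entries:
            # Mark as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = tuple(self._compute(row_index, target_name))
        self._entries[key] = value
        if len(self._entries) > self._maxsize:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value

    def clear(self):
        self._entries.clear()
```

Because the cache lives on the instance, it is released together with the owning object, which is the retention concern raised for `@lru_cache` on a method.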

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5a7aee8-f7a7-49eb-aa5a-ba3a9c9dce75

📥 Commits

Reviewing files that changed from the base of the PR and between 08af225 and 48927f1.

📒 Files selected for processing (10)
  • .gitignore
  • README.md
  • detectors/__init__.py
  • detectors/spike.py
  • detectors/trend.py
  • explorer.py
  • extractors/structural.py
  • mhc/constants.py
  • mhc_weekly/constants.py
  • templates/templates.json
💤 Files with no reviewable changes (1)
  • extractors/structural.py
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore

@max-rosenblattl max-rosenblattl mentioned this pull request Apr 12, 2026
1 task
@max-rosenblattl (Collaborator)

The CrossChannelExtractor is included in the weekly pipeline, but templates_hourly.json doesn't have a cross_channel key — so --weekly crashes with a KeyError. Cross-channel synthesis doesn't make sense for hourly-aggregated data anyway; probably easiest to skip it when --weekly is set.
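One way to express the suggested skip, as a sketch only: `cross_channel_enabled` is a hypothetical helper, and the actual wiring of the `--weekly` flag into the Annotator bootstrap is not shown in this thread.

```python
def cross_channel_enabled(weekly: bool, templates: dict) -> bool:
    """Decide whether to add the cross-channel extractor to the pipeline.

    Skips it for weekly (hourly-aggregated) runs, and also when the loaded
    templates have no "cross_channel" section, so a missing key cannot
    surface later as a KeyError.
    """
    return (not weekly) and "cross_channel" in templates
```

The second condition is a defensive extra beyond what the comment asks for; the flag check alone would already avoid the crash described.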

Comment thread synthesizers/cardio.py Outdated
        hr_summary.append("the heartrate was elevated during this phase")
        hr_summary.append(f"peak HR {hr_peak:.0f} bpm")
    else:
        hr_summary.append(f"avg HR {hr_mean:.0f} bpm")
@max-rosenblattl (Collaborator) commented Apr 14, 2026

hr_summary.append(f"avg HR {hr_mean:.0f} bpm") this line is added twice in each if branch

I think more fluent German sentences fit the current captions better than the current abbreviations.
"averaging a heart rate of HR bpm, peaking at HR..."

Collaborator

Let's use the meta channel config for that; we defined it somewhere.

Collaborator Author

Made sentences more fluid in 9f72309

Comment thread synthesizers/cardio.py Outdated
parts.append(f"this means HR is {abs(hr_day_delta):.0f} bpm {direction} than the mean of the day")
channel_idxs.append(hr_idx)

distance_mean = self._metric_mean(row, distance_idx, start, end)
Collaborator

Can we split this up into two functions for readability?

Collaborator Author

Addressed in df174d1

Collaborator Author

refactored and put into parent in 5cb38ef

@max-rosenblattl (Collaborator) left a comment

Thanks @KarlDeck 🚀 I think the functionality is mostly there, but a few more structural changes are necessary.

Comment thread synthesizers/__init__.py
return float(np.sum(values))

@classmethod
def _metric_day_mean_delta(cls, row: Recording, idx: int | None, start: int, end: int) -> float | None:
Collaborator

just a question: should we exclude the window mean from the daily mean? probably not though

Collaborator Author

No, I don't think so. But something I've been wanting to do for a while now is to calculate the resting HR for the person and then the RR, because that's a very informative medical measurement. This might require calculating the resting HR over multiple days to be more accurate, so either we use multi-day values or we do it for a single day as an approximation. For the weekly data we can definitely do it, even for one time series.
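To make the question in this thread concrete, here is a sketch of both variants of the day-mean delta. It works on a plain array rather than a `Recording`, and `day_mean_delta` with its `exclude_window` flag is a hypothetical stand-in for `_metric_day_mean_delta`, whose body is not shown here.

```python
import numpy as np


def day_mean_delta(values, start, end, exclude_window=False):
    """Mean of values[start:end] minus the mean of the whole day.

    With exclude_window=True the baseline is computed only from samples
    outside the window. Returns None if either side has no observed data.
    """
    values = np.asarray(values, dtype=float)
    window = values[start:end]
    if exclude_window:
        baseline = np.concatenate((values[:start], values[end:]))
    else:
        baseline = values
    if np.all(np.isnan(window)) or np.all(np.isnan(baseline)):
        return None
    return float(np.nanmean(window) - np.nanmean(baseline))
```

For a toy day `[60, 60, 120, 120]` with the workout in the second half, the delta is 30 against the full-day mean and 60 when the window is excluded, so the choice mostly matters when the window covers a large share of the day.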

Comment thread synthesizers/__init__.py
from timef.schema import Annotation, Recording


class CrossChannelSynthesizer(abc.ABC):
Collaborator

I think all the helper methods in __init__.py would be better suited to a _helper.py

from util import seed_from_key


class StationaryActivitySynthesizer(CrossChannelSynthesizer):
Collaborator

I think we should consider having a base WorkoutSynthesizer/ActivitySynthesizer, because Stationary and Cardio are almost duplicating 80% of the code; a base class for these too makes a lot of sense, also thinking about the other activities/workouts, where we would replicate a lot of code
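A rough sketch of what such a base class could look like, under heavy assumptions: rows are modeled as a dict of NumPy arrays instead of `Recording`, the `workout:*` channel names are invented, and captions are plain strings rather than `Annotation` objects.

```python
import abc

import numpy as np


def _runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True values."""
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))


class WorkoutSynthesizer(abc.ABC):
    # Maps a workout kind to the (hypothetical) channel that flags it.
    workout_channels = {}

    def __init__(self, min_duration=1):
        self.min_duration = min_duration

    def synthesize(self, row):
        """Shared window detection; subclasses only decide how to describe."""
        captions = []
        for kind, channel in self.workout_channels.items():
            values = row.get(channel)
            if values is None:
                continue
            active = np.nan_to_num(np.asarray(values, dtype=float)) > 0
            for start, end in _runs(active):
                if end - start >= self.min_duration:
                    captions.append(self.describe(kind, row, start, end))
        return captions

    @abc.abstractmethod
    def describe(self, kind, row, start, end):
        """Build the caption text for one detected window."""


class CardioSynthesizer(WorkoutSynthesizer):
    workout_channels = {"running": "workout:running", "cycling": "workout:cycling"}

    def describe(self, kind, row, start, end):
        return f"{kind} from sample {start} to {end}"


class StationaryActivitySynthesizer(WorkoutSynthesizer):
    workout_channels = {"hiit": "workout:hiit", "strength": "workout:strength"}

    def describe(self, kind, row, start, end):
        return f"stationary {kind} workout from sample {start} to {end}"
```

New activity types would then only add a channel mapping and a `describe` method, while window detection and duration gating stay in one place.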

)
return results

def _metrics_suffix(
Collaborator

Currently the metrics suffix is deterministic. I think we should adapt the template based behaviour from the other captions for greater variety. I'm not sure what's the best format to encode it but something along the lines:

  "cross_channel": {
    "metrics": {                                                                                    
      "hr_summary": [                         
        "averaging a {name} of {mean:.0f} {unit}",                                                  
        "with a mean {name} of {mean:.0f} {unit}",
        "at roughly {mean:.0f} {unit} {name}"                                                       
      ],                                      
      "hr_peak": ["peaking at {peak:.0f} {unit}", ...],                                             
      "hr_elevated": [...],                                                                         
      "hr_day_delta_higher": [...],                                                                 
      "hr_day_delta_lower": [...],                                                                  
      "distance_mean": [...],                                                                       
      "distance_total": [...],                                                                      
      "step_mean": [...],                                                                           
      "step_total": [...]                                                                           
    }             
  } 

Also, important: currently I do see the cross_channel captions as more sophisticated semantic captions. Therefore, I would make them mutually exclusive (we don't have to handle that in code). But it requires us to have cross_channel captions for each activity. @KarlDeck, did you find anything that would speak against extending them (not in this PR, but in a new one)?
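A possible shape for the template-driven metrics suffix, sketched under assumptions: the template strings are the ones proposed in this comment, while `stable_seed`, `render_metric`, and the key format are hypothetical (the repo's own `seed_from_key` helper is not used because its signature is not shown here).

```python
import hashlib
import random

# Subset of the proposed "cross_channel" -> "metrics" section.
METRIC_TEMPLATES = {
    "hr_summary": [
        "averaging a {name} of {mean:.0f} {unit}",
        "with a mean {name} of {mean:.0f} {unit}",
        "at roughly {mean:.0f} {unit} {name}",
    ],
    "hr_peak": ["peaking at {peak:.0f} {unit}"],
}


def stable_seed(key: str) -> int:
    """Derive a reproducible integer seed from a string key."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def render_metric(kind: str, key: str, **values) -> str:
    """Pick one template for `kind`, deterministically per key, and fill it in."""
    rng = random.Random(stable_seed(f"{key}:{kind}"))
    template = rng.choice(METRIC_TEMPLATES[kind])
    return template.format(**values)
```

Seeding per key keeps a given window's caption stable across runs while still varying the phrasing between windows.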
