Karl deck/spike drop redesign #45
base: main
Changes from all commits: ea9f84c, 8910497, 14b778f, dda223c, afa4deb
```diff
@@ -1,4 +1,5 @@
 __pycache__/
 data/
 .env
 .claude/
 docs/
```
detectors/spike.py

```diff
@@ -7,163 +7,162 @@
 from __future__ import annotations

 import numpy as np
-from scipy.signal import find_peaks, peak_widths
+from scipy.signal import find_peaks

 from detectors import DetectionResult, StructuralDetector


 class SpikeDetector(StructuralDetector):
-    """Detects spikes and drops.
-    """
+    """Detects upward point anomalies using a local-baseline heuristic."""

     def __init__(
         self,
         filter_zeros: bool = False,
         prominence_scale: float = 3.0,
         min_prominence: float = 0.0,
         min_distance: int = 1,
         min_width: int = 1,
         top_k: int | None = None,
         smooth_window: int = 1,
         drop_localization: str = "center",
         max_cluster_peaks: int = 2,
         soft_top_k_margin: float = 0.90,
     ) -> None:
         super().__init__(filter_zeros=filter_zeros)
         self.prominence_scale = prominence_scale
         self.min_prominence = min_prominence
         self.min_distance = max(1, min_distance)
         self.min_width = max(1, min_width)
         self.top_k = top_k
         self.smooth_window = max(1, smooth_window)
         self.drop_localization = drop_localization
         self.max_cluster_peaks = max(1, max_cluster_peaks)
         self.soft_top_k_margin = min(max(float(soft_top_k_margin), 0.0), 1.0)
         self.duplicate_radius = max(1, min(3, self.min_distance // 5 if self.min_distance > 1 else 1))

     def _detect(self, filtered_signal: np.ndarray, indices: np.ndarray) -> list[DetectionResult]:
-        smoothed_signal = self._smooth(filtered_signal)
-        prominence = self._prominence_threshold(filtered_signal)  # threshold on raw/filtered, not smoothed
-        if prominence <= 0:
-            return []
-
-        results: list[DetectionResult] = []
-        seen_minutes: set[int] = set()
-
-        for peak_idx, peak_score, properties in self._select_peaks(smoothed_signal, prominence):
-            minute = self._localize_spike(peak_idx, properties, filtered_signal, indices)
-            if minute in seen_minutes:
-                continue
-            results.append(DetectionResult(event_type="spike", spike_minute=minute, score=peak_score))
-            seen_minutes.add(minute)
-
-        for peak_idx, peak_score, properties in self._select_drop_peaks(smoothed_signal, prominence):
-            minute = self._localize_drop(peak_idx, properties, smoothed_signal, indices)
-            if minute in seen_minutes:
-                continue
-            results.append(DetectionResult(event_type="drop", spike_minute=minute, score=peak_score))
-            seen_minutes.add(minute)
-
+        candidates = self._collect_candidates(filtered_signal, indices)
+        results = self._select_results(candidates)
         results.sort(key=lambda result: int(result.spike_minute))
         return results

     def _smooth(self, filtered_signal: np.ndarray) -> np.ndarray:
         if self.smooth_window <= 1:
             return filtered_signal

         kernel = np.ones(self.smooth_window, dtype=float) / float(self.smooth_window)
         return np.convolve(filtered_signal, kernel, mode="same")

-    def _select_peaks(
+    def _collect_candidates(
         self,
-        smoothed_signal: np.ndarray,
-        prominence: float,
-    ) -> list[tuple[int, float, dict[str, int]]]:
+        filtered_signal: np.ndarray,
+        indices: np.ndarray,
+    ) -> list[dict[str, float | int]]:
         peaks, properties = find_peaks(
-            smoothed_signal,
-            prominence=prominence,
-            distance=self.min_distance,
+            filtered_signal,
+            prominence=self.min_prominence,
```
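The `_smooth` helper carried over in the diff is a plain uniform moving average. A minimal standalone sketch of the same computation (the free function `smooth` here is just for illustration; the project keeps this logic as a method):

```python
import numpy as np

def smooth(signal: np.ndarray, window: int) -> np.ndarray:
    # Uniform moving average, same-length output, mirroring _smooth:
    # a window of ones normalized by its length, applied via convolution.
    if window <= 1:
        return signal
    kernel = np.ones(window, dtype=float) / float(window)
    return np.convolve(signal, kernel, mode="same")

x = np.array([0.0, 0.0, 6.0, 0.0, 0.0])
print(smooth(x, 3))  # [0. 2. 2. 2. 0.]
```

Note that `mode="same"` pads implicitly with zeros at the edges, so boundary values are damped relative to a true windowed mean.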
maybe solve by rename
🧩 Analysis chain
🌐 Web query:
scipy find_peaks distance parameter behavior
💡 Result:
In `scipy.signal.find_peaks`, the `distance` parameter enforces a minimum horizontal spacing (in samples) between returned peak indices:

- `distance` is the required minimal horizontal distance (>= 1) in samples between neighbouring peaks. If candidate peaks are closer than `distance`, the smaller peaks are removed first until all remaining peaks satisfy the spacing constraint. [1]
- Conceptually, SciPy's implementation keeps the taller peaks preferentially and suppresses nearby smaller ones (i.e., it effectively processes peaks from high to low and removes neighbours within the forbidden window). [2]
- The guarantee applies to peak indices being at least `distance` samples apart; for flat/plateau peaks, however, the plateau edges may end up closer than `distance` even though the returned peak indices respect it. [1]

Sources: SciPy docs for `find_peaks` [1] and an explanation/derivation based on the SciPy implementation [2].
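The suppression behaviour described above can be seen directly on a toy signal (the values here are arbitrary illustration data):

```python
import numpy as np
from scipy.signal import find_peaks

# Toy signal: a tall peak at index 5 and a smaller one at index 8.
x = np.zeros(20)
x[5] = 5.0
x[8] = 4.0

peaks_all, _ = find_peaks(x)
peaks_spaced, _ = find_peaks(x, distance=5)
print(list(peaks_all))     # [5, 8]
print(list(peaks_spaced))  # [5]  — the weaker neighbour within 5 samples is removed
```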
🏁 Scripts executed (repository: StanfordBDHG/SensorTSLM):

```shell
find . -name "spike.py" -path "*/detectors/*" | head -5
cat -n ./detectors/spike.py | sed -n '1,60p'
cat -n ./detectors/spike.py | sed -n '43,100p'
cat -n ./detectors/spike.py | sed -n '100,180p'
```
**`min_distance` parameter is not applied during peak finding.**
The `min_distance` attribute is stored in the constructor (line 30) but is not passed to `find_peaks()` (line 48). This means the minimum spacing constraint is not enforced during candidate collection. The `duplicate_radius` clustering in `_select_results` handles deduplication of collected candidates but doesn't replace the `distance` parameter's role in actively suppressing weaker peaks that fall within the minimum distance during the initial peak detection phase.
Pass `distance=self.min_distance` to enforce spacing during peak finding:
🔧 Proposed fix

```diff
 peaks, properties = find_peaks(
     filtered_signal,
     prominence=self.min_prominence,
     width=self.min_width,
+    distance=self.min_distance,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@detectors/spike.py` around lines 48-54, the peak-finding call does not apply the stored `min_distance` constraint, so update the `find_peaks` invocation that processes `filtered_signal` (where `peaks, properties = find_peaks(...)`) to include `distance=self.min_distance` so spacing is enforced during candidate collection; keep the existing `duplicate_radius` logic in `_select_results` for clustering, but ensure initial suppression uses the `min_distance` attribute.
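To illustrate the reviewer's point, the sketch below contrasts a post-hoc radius deduplication with `find_peaks`' built-in `distance` suppression. The dedup loop is a hypothetical stand-in for what a `_select_results`-style pass with a small radius would do, not the project's actual code:

```python
import numpy as np
from scipy.signal import find_peaks

x = np.zeros(30)
x[10] = 6.0
x[13] = 5.0  # weaker peak only 3 samples away

# Without distance, both peaks survive prominence filtering.
peaks, _ = find_peaks(x, prominence=1.0)
print(list(peaks))  # [10, 13]

# A post-hoc dedup with a small radius (hypothetical duplicate_radius-style
# pass) also keeps both, because 3 > radius:
kept: list[int] = []
radius = 1
for p in sorted(peaks, key=lambda i: -x[i]):  # strongest first
    if all(abs(p - q) > radius for q in kept):
        kept.append(p)
print(sorted(kept))  # [10, 13]

# Passing distance suppresses the weaker neighbour at detection time.
peaks_d, _ = find_peaks(x, prominence=1.0, distance=5)
print(list(peaks_d))  # [10]
```

This is why the small-radius clustering cannot substitute for `distance`: the two mechanisms operate at different scales, and only `distance` removes weaker peaks inside the full minimum-spacing window.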