[log] log: add debug logging to health monitor #3265
Add 6 meaningful debug log calls to internal/launcher/health_monitor.go using the existing logHealth logger (launcher:health namespace):
- NewHealthMonitor: log creation with interval and max restart failures
- Stop: log when stop is initiated (before blocking on doneCh)
- run: log goroutine startup with interval
- checkAll: log total servers being checked each cycle
- checkAll: log when a recovered server's failure counter is reset
- handleErrorState: log when max failures reached and restart is skipped

These additions improve troubleshooting of health monitoring behavior during development and production incident investigation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds debug-level logging to the launcher health monitor to improve observability during periodic health checks and server auto-restart behavior.
Changes:
- Log health monitor creation with configured interval and restart-failure threshold.
- Add debug logs around lifecycle events (stop initiation, goroutine start).
- Add debug logs during health-check cycles (server count, recovery/reset, skip-restart at max failures).
Show a summary per file
| File | Description |
|---|---|
| internal/launcher/health_monitor.go | Adds launcher:health debug logs for monitor lifecycle and health-check activity. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 1
    if failures >= maxConsecutiveRestartFailures {
        // Already logged when the threshold was reached; stay silent.
        logHealth.Printf("Skipping restart for serverID=%s: max failures reached (%d/%d)", serverID, failures, maxConsecutiveRestartFailures)
        return
The failures >= maxConsecutiveRestartFailures branch is documented as “stay silent” and the PR description says the restart is “silently skipped”, but this adds a debug log that will fire on every health-check tick for a permanently-failed server (potential log spam when DEBUG enables launcher:health). Consider removing this log, or only logging once at the moment the threshold is reached (and keep the comment/PR description consistent).
Summary
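Adds 6 meaningful debug log calls to internal/launcher/health_monitor.go using the existing logHealth logger (launcher:health namespace).
Changes
File modified: internal/launcher/health_monitor.go
- NewHealthMonitor: log creation with interval and max restart failures
- Stop: log when stop is initiated (before blocking on doneCh)
- run: log goroutine startup with interval
- checkAll: log total servers being checked each cycle
- checkAll: log when a recovered server's failure counter is reset
- handleErrorState: log when max failures reached and restart is skipped
Why These Additions?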
The health monitor is a critical background component that manages server recovery. Without debug logging, it's difficult to answer questions like:
Quality Checklist
- Existing logHealth logger (no new declaration needed)
- pkg:filename convention (launcher:health)