[LFXV2-1701] fix: downgrade health check poll logs from error to warn by andrest50 · Pull Request #58 · linuxfoundation/lfx-v2-indexer-service

andrest50 · 2026-05-07T19:05:21Z

Summary

Downgrade NATS health check log messages from Error to Warn in messaging_repository.go — these fire on every K8s probe poll (~10s) during a degraded state, flooding logs with repeated errors
Rename and downgrade OpenSearch health check log messages from Error to Warn in storage_repository.go, adding "OpenSearch" context to the message so the failing dependency is unambiguous
One-time lifecycle events (e.g. NATS connection permanently closed - max reconnects exhausted) remain at Error since they fire once and identify the root cause

Ticket

🤖 Generated with Claude Code

Health check failures logged at ERROR on every K8s probe poll (~10s) flooded logs during degraded state. Downgraded to WARN and renamed the OpenSearch log messages to include the dependency name for clarity. One-time lifecycle events (e.g. NATS permanently closed) remain ERROR. Generated with [Claude Code](https://claude.ai/code) Signed-off-by: Andres Tobon <andrest2455@gmail.com>

Copilot

Pull request overview

This PR reduces noisy error-level logging from periodic dependency health checks (NATS and OpenSearch) so degraded states don’t flood logs during frequent Kubernetes probe polling, while preserving error logs for one-time lifecycle events.

Changes:

Downgrade NATS health check failure logs from Error to Warn.
Rename/downgrade OpenSearch health check logs to Warn and add explicit “OpenSearch” context in the messages.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
internal/infrastructure/storage/storage_repository.go	Downgrades OpenSearch health check failure logs to `Warn` and clarifies messages with “OpenSearch” context.
internal/infrastructure/messaging/messaging_repository.go	Downgrades NATS health check failure logs to `Warn` to reduce repeated error spam during outages.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai · 2026-05-07T19:07:15Z

Walkthrough

Two infrastructure health-check methods were modified: MessagingRepository.HealthCheck and StorageRepository.HealthCheck. Both now log warnings instead of errors for transient failures while maintaining identical error return behavior. The messaging implementation also refactors error message formatting to use the health-check error constant.

Changes

Health-Check Error Handling

Layer / File(s)	Summary
Health-Check Error Logging `internal/infrastructure/messaging/messaging_repository.go`, `internal/infrastructure/storage/storage_repository.go`	Messaging and storage health-check failures log at warning level instead of error level. Messaging also refactors error formatting to use `fmt.Errorf("%s: ...", constants.ErrHealthCheck, ...)`. Error return values remain unchanged.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately captures the main change: downgrading health check poll logs from error to warning level.
Description check	✅ Passed	The description clearly explains the rationale for downgrading logs, provides specific details about which components are affected, and distinguishes between recurring health checks and one-time lifecycle events.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch atobon/LFXV2-1701-reduce-health-check-log-noise

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

internal/infrastructure/messaging/messaging_repository.go (1)
463-481: 💤 Low value

LGTM — log-level downgrade and error format alignment look correct.

All three failure paths consistently emit Warn instead of Error, and the returned errors now uniformly use constants.ErrHealthCheck as a prefix, matching the pattern already established in storage_repository.go. No logic changes were made.

One pre-existing observation not introduced here: the check at Line 476 (r.conn.Status() != nats.CONNECTED) is logically equivalent to !r.conn.IsConnected() in the NATS client (both read the same internal status field). The only scenario where Line 476 fires but Line 470 did not is a concurrent state change between the two reads (TOCTOU window). Consider collapsing the two checks into one for clarity if the race defence is not intentional.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/infrastructure/messaging/messaging_repository.go` around lines 463 -
481, The health-check contains two equivalent status checks
(r.conn.IsConnected() and r.conn.Status() != nats.CONNECTED) which can race and
are redundant; consolidate them into a single check using r.conn.IsConnected()
(or r.conn.Status() directly) and keep the diagnostic r.logger.Debug call and
the Warn + fmt.Errorf(constants.ErrHealthCheck...) behavior, removing the
duplicate branch that checks r.conn.Status() != nats.CONNECTED and preserving
its more detailed warning message (status and expected) in the remaining check
so the logic is clear and TOCTOU windows are avoided; reference
r.conn.IsConnected(), r.conn.Status(), and nats.CONNECTED when making the
change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/infrastructure/messaging/messaging_repository.go`:
- Around line 463-481: The health-check contains two equivalent status checks
(r.conn.IsConnected() and r.conn.Status() != nats.CONNECTED) which can race and
are redundant; consolidate them into a single check using r.conn.IsConnected()
(or r.conn.Status() directly) and keep the diagnostic r.logger.Debug call and
the Warn + fmt.Errorf(constants.ErrHealthCheck...) behavior, removing the
duplicate branch that checks r.conn.Status() != nats.CONNECTED and preserving
its more detailed warning message (status and expected) in the remaining check
so the logic is clear and TOCTOU windows are avoided; reference
r.conn.IsConnected(), r.conn.Status(), and nats.CONNECTED when making the
change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 35b73a2c-a38b-44de-b174-786c904af71e

📥 Commits

Reviewing files that changed from the base of the PR and between 13040f5 and 5e10a79.

📒 Files selected for processing (2)

internal/infrastructure/messaging/messaging_repository.go
internal/infrastructure/storage/storage_repository.go

Copilot AI review requested due to automatic review settings May 7, 2026 19:05

andrest50 requested a review from a team as a code owner May 7, 2026 19:05

Copilot started reviewing on behalf of andrest50 May 7, 2026 19:05 View session

Copilot AI reviewed May 7, 2026

View reviewed changes

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

jordane approved these changes May 7, 2026

View reviewed changes

andrest50 merged commit b24e82e into main May 7, 2026
13 checks passed

andrest50 deleted the atobon/LFXV2-1701-reduce-health-check-log-noise branch May 7, 2026 19:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LFXV2-1701] fix: downgrade health check poll logs from error to warn#58

[LFXV2-1701] fix: downgrade health check poll logs from error to warn#58
andrest50 merged 1 commit into
mainfrom
atobon/LFXV2-1701-reduce-health-check-log-noise

andrest50 commented May 7, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

coderabbitai Bot commented May 7, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

andrest50 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Ticket

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

andrest50 commented May 7, 2026 •

edited

Loading

coderabbitai Bot commented May 7, 2026 •

edited

Loading