Cleanup stale metrics and event subscriptions #889
41bec4e to 1892ef5
📝 Walkthrough
Adds BMC.GetEventSubscription and a Redfish implementation that treats HTTP 404 as “not present”. The controller now verifies the stored metrics and events subscription links, clears stale status links, and recreates missing subscriptions. A mock-server helper and an integration test simulate external deletion and verify reconciliation.
Changes
Event Subscription Staleness Detection
Sequence Diagram
sequenceDiagram
    participant Controller as BMCReconciler
    participant BMCClient as RedfishBaseBMC
    participant Redfish as RedfishEventService
    participant Mock as MockServer
    Controller->>BMCClient: GetEventSubscription(ctx, metricsURI)
    BMCClient->>Redfish: ev.GetEventSubscription(metricsURI)
    Redfish-->>BMCClient: 404 (not found) / subscription object
    BMCClient-->>Controller: (false,nil) or (true,nil)
    alt subscription not found
        Controller->>Controller: clear status link (status patch)
        Controller->>BMCClient: create subscription
        BMCClient->>Redfish: CreateEventSubscription(...)
        Redfish-->>BMCClient: new subscription URI
        BMCClient-->>Controller: new URI
    else subscription exists
        Controller-->>Controller: keep status link
    end
    Note over Mock,Redfish: MockServer.DeleteSubscription simulates external deletion by removing the member from the subscription collection
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
xkonni left a comment
good, fixes issue, minor optimizations possible.
1892ef5 to 5c3f881
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bmc/mock/server/server.go`:
- Around line 1313-1321: The code can silently overwrite an existing collection
with an empty one when type assertions fail or dataFS.ReadFile/json.Unmarshal
fail; update the loading logic around the variables col, cached, Collection,
dataFS.ReadFile, json.Unmarshal and collectionKey so that: (1) when hasOverride
is true but cached is not a Collection, do NOT replace col (leave original
cached value or bail); (2) when ReadFile returns an error or json.Unmarshal
fails, do not overwrite col with the empty default—propagate/handle the error or
keep the prior value; and (3) before the persistence step that writes col (the
save at the persistence call around collection handling), add a guard to only
persist when col is valid (e.g., col.ID is non-empty or len(col.Members) > 0);
if invalid, skip saving and return/log an error so you cannot accidentally wipe
members.
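The three guards requested for the mock server can be sketched as follows. This is an assumption-laden illustration: the `Collection` and `Member` types, `loadCollection`, and `removeMember` are hypothetical and only mirror the names mentioned in the comment; the real mock server's types and persistence step may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Member and Collection are hypothetical shapes for a Redfish collection document.
type Member struct {
	ODataID string `json:"@odata.id"`
}

type Collection struct {
	ID      string   `json:"Id"`
	Members []Member `json:"Members"`
}

// loadCollection returns the cached override if it really is a Collection,
// otherwise it decodes raw. It never falls back to an empty default.
func loadCollection(cached any, hasOverride bool, raw []byte) (Collection, error) {
	if hasOverride {
		col, ok := cached.(Collection)
		if !ok {
			return Collection{}, fmt.Errorf("cached override has unexpected type %T", cached)
		}
		return col, nil
	}
	var col Collection
	if err := json.Unmarshal(raw, &col); err != nil {
		return Collection{}, fmt.Errorf("failed to decode collection: %w", err)
	}
	return col, nil
}

// removeMember drops uri from col and refuses to continue when col looks
// invalid (no ID and no members), so an empty value is never persisted.
func removeMember(col Collection, uri string) (Collection, error) {
	if col.ID == "" && len(col.Members) == 0 {
		return col, errors.New("refusing to persist an empty collection")
	}
	kept := make([]Member, 0, len(col.Members))
	for _, m := range col.Members {
		if m.ODataID != uri {
			kept = append(kept, m)
		}
	}
	col.Members = kept
	return col, nil
}

func main() {
	raw := []byte(`{"Id":"Subscriptions","Members":[{"@odata.id":"/a"},{"@odata.id":"/b"}]}`)
	col, err := loadCollection(nil, false, raw)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	col, err = removeMember(col, "/a")
	fmt.Println(col, err)
}
```

Returning an error on every failed load means the caller has to decide explicitly what to do, rather than silently saving an empty collection over existing members.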
In `@internal/controller/bmc_controller.go`:
- Around line 567-580: Update the error string in the patch failure to reference
the BMC status rather than "server status": when handling
bmcObj.Status.MetricsReportSubscriptionLink and calling r.Status().Patch(ctx,
bmcObj, client.MergeFrom(bmcBase)), change the fmt.Errorf message used on error
return to something like "failed to patch BMC status to clear stale metrics
subscription link: %w" so it correctly references bmcObj (BMC) and its status.
- Around line 595-608: The error message used when the status patch fails
incorrectly references "server status"; update the fmt.Errorf call in the block
handling bmcObj.Status.EventsSubscriptionLink (around GetEventSubscription and
r.Status().Patch) to say "failed to patch BMC status to clear stale events
subscription link" so it correctly names the BMC status and use the existing
variables (bmcObj, bmcBase) and error wrapping as-is.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 11823367-8cf1-4fb0-bc23-9df7749fba1a
📒 Files selected for processing (5)
bmc/bmc.go
bmc/mock/server/server.go
bmc/redfish.go
internal/controller/bmc_controller.go
internal/controller/bmc_controller_test.go
Signed-off-by: Alan Sergeant <alan.sergeant@sap.com>
5c3f881 to bfda8c6
@coderabbitai review
✅ Actions performed: Review triggered.
    if err := r.Status().Patch(ctx, bmcObj, client.MergeFrom(bmcBase)); err != nil {
        return false, fmt.Errorf("failed to patch BMC status to clear stale metrics subscription link: %w", err)
    }
I would just clear the link in memory and not patch the resource here, since we patch it later anyway. This reduces the number of API calls.
    if err := r.Status().Patch(ctx, bmcObj, client.MergeFrom(bmcBase)); err != nil {
        return false, fmt.Errorf("failed to patch BMC status to clear stale events subscription link: %w", err)
    }
I would just clear the link in memory and not patch the resource here, since we patch it later anyway. This reduces the number of API calls.
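The suggestion in these two comments, clearing stale links in memory and writing the status once, could look roughly like this. It is a sketch under stated assumptions: `status`, `countingPatcher`, `refreshLink`, and `reconcileLinks` are hypothetical, and `countingPatcher.Patch` only stands in for `r.Status().Patch` so that the number of writes is visible.

```go
package main

import "fmt"

// status mirrors only the two link fields discussed here; the real BMC status has more.
type status struct {
	MetricsReportSubscriptionLink string
	EventsSubscriptionLink        string
}

// countingPatcher is a hypothetical stand-in for the status patch call that counts writes.
type countingPatcher struct {
	calls  int
	stored status
}

func (p *countingPatcher) Patch(s status) {
	p.calls++
	p.stored = s
}

// refreshLink returns link unchanged if it still exists, otherwise a new one.
func refreshLink(link string, exists func(string) bool, create func() string) string {
	if link != "" && exists(link) {
		return link
	}
	// Stale or missing: drop the old value in memory only and recreate.
	return create()
}

// reconcileLinks updates both links in memory and patches at most once.
func reconcileLinks(cur status, exists func(string) bool, create func(kind string) string, p *countingPatcher) status {
	next := cur
	next.MetricsReportSubscriptionLink = refreshLink(cur.MetricsReportSubscriptionLink, exists,
		func() string { return create("metrics") })
	next.EventsSubscriptionLink = refreshLink(cur.EventsSubscriptionLink, exists,
		func() string { return create("events") })
	if next != cur {
		p.Patch(next) // one status write covers both links
	}
	return next
}

func main() {
	p := &countingPatcher{}
	gone := func(string) bool { return false }
	create := func(kind string) string { return "/subscriptions/" + kind }
	cur := status{MetricsReportSubscriptionLink: "/old/metrics", EventsSubscriptionLink: "/old/events"}
	out := reconcileLinks(cur, gone, create, p)
	fmt.Println(out, p.calls)
}
```

The trade-off is that a stale link stays in the stored status until the later patch succeeds, which is harmless as long as the next reconcile repeats the same check.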
    if err != nil {
        return false, fmt.Errorf("failed to check metrics report subscription for BMC %s (%s): %w", bmcObj.Name, bmcObj.Status.IP, err)
    }
How about just logging this error instead of returning it, e.g. in case we run into a transient error? Otherwise the subscription logic will block the BMC reconciliation.
    exists, err := bmcClient.GetEventSubscription(ctx, bmcObj.Status.EventsSubscriptionLink)
    if err != nil {
        return false, fmt.Errorf("failed to check events subscription for BMC %s (%s): %w", bmcObj.Name, bmcObj.Status.IP, err)
    }
How about just logging this error instead of returning it, e.g. in case we run into a transient error? Otherwise the subscription logic will block the BMC reconciliation.
Fixes #886