Enh/fallback allowable values simpleupdate#846
Conversation
📝 WalkthroughWalkthroughThis PR introduces fallback transfer protocol support for firmware updates. It adds optional Changes
Sequence DiagramsequenceDiagram
participant Controller as Version Controller
participant BMC as BMC Client
participant Redfish as Redfish UpdateService
participant Resolve as Protocol Resolver
Controller->>BMC: GetSupportedTransferProtocols()
BMC->>Redfish: Fetch SimpleUpdate.AllowableValues
Redfish-->>BMC: [HTTPS, NFS, CIFS, TFTP]
BMC-->>Controller: supported protocols
Controller->>Resolve: resolveTransferProtocol(imageSpec, supported)
alt Primary protocol supported
Resolve-->>Controller: (primary_protocol, primary_uri, nil)
else Primary unsupported, fallback configured
Resolve-->>Controller: (fallback_protocol, fallback_uri, nil)
Controller->>Controller: log fallback usage
else Neither supported
Resolve-->>Controller: (empty, empty, validation_error)
Controller->>Controller: patch status Failed, return error
end
alt Validation success
Controller->>BMC: UpgradeVersion(resolved_protocol, resolved_uri)
BMC-->>Controller: success
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (1)
internal/controller/biosversion_controller_test.go (1)
451-480: Strengthen fallback test so it proves fallback was actually used.Line 460 and Line 463 currently use equivalent successful URI payloads, so the test can still pass even if fallback resolution regresses. Consider making the primary URI intentionally invalid for the mock upgrade path while keeping fallback URI valid.
💡 Suggested test hardening
- URI: upgradeServerBiosVersion, + URI: "http://primary.invalid/not-a-valid-upgrade-image", TransferProtocol: "HTTP", // NOT supported by mock BMC FallbackTransferProtocol: "HTTPS", // Supported by mock BMC FallbackURI: upgradeServerBiosVersion,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/biosversion_controller_test.go` around lines 451 - 480, The test creates a BIOSVersion (BIOSVersionSpec -> BIOSVersionTemplate -> ImageSpec) with TransferProtocol "HTTP" and FallbackTransferProtocol "HTTPS" but both URI and FallbackURI point to the same successful payload so the fallback isn't actually exercised; change the primary URI (the ImageSpec.URI used with TransferProtocol "HTTP") to an intentionally invalid/unreachable value that will fail the mock BMC upgrade path while keeping ImageSpec.FallbackURI pointing to the valid upgradeServerBiosVersion; this ensures the controller must attempt the primary, fail, then use the fallback and the Eventually checks on BIOSVersion.Status.State (InProgress -> Completed) truly validate fallback behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1alpha1/biosversion_types.go`:
- Around line 68-69: The CEL XValidation rules must reject empty fallback
values, not just presence: update the rules referencing fallbackTransferProtocol
and fallbackURI so they require non-empty strings (e.g. replace
has(self.fallbackURI) and has(self.fallbackTransferProtocol) with checks that
include size(...) > 0 or equivalent), and ensure the first rule also enforces
transferProtocol is non-empty when fallbackTransferProtocol is present; apply
the same tightening to the other occurrence(s) noted (the block around the
symbols fallbackTransferProtocol, fallbackURI, and transferProtocol).
In `@bmc/redfish_local.go`:
- Around line 98-131: The UnitTestMockUps shared state (fields like
BIOSUpgradeTaskIndex, BIOSUpgradingVersion, BIOSUpgradeTaskStatus,
BIOSUpgradeTaskFailedStatus, BIOSVersion and similar BMC* fields) is accessed
concurrently in UpgradeBiosVersion and GetBiosUpgradeTask (and the other
goroutine-based upgrade handlers), causing data races; add synchronization by
embedding a mutex in the UnitTestMockUps struct (or use atomics for the numeric
indices and a mutex for maps/version fields), then wrap every read/write of
those fields inside the mutex (Lock/Unlock) or use atomic ops for
BIOSUpgradeTaskIndex/BMCUpgradeTaskIndex; update UpgradeBiosVersion,
GetBiosUpgradeTask and the equivalent BMC upgrade functions to acquire the lock
before reading/writing UnitTestMockUps fields and release it after.
In `@config/crd/bases/metal.ironcore.dev_biosversions.yaml`:
- Around line 110-117: The current CEL x-kubernetes-validations in the ImageSpec
allow empty strings because they only check presence; update the ImageSpec
XValidation rules for fallbackTransferProtocol and fallbackURI to require
non-empty values when present (e.g. replace has(...) checks with has(...) &&
size(self.fallbackTransferProtocol) > 0 and similarly size(self.fallbackURI) >
0) and update the paired-rule to require both are set and non-empty or both
unset, then regenerate the CRD with the project manifest generation (run make
manifests) so the changes propagate into the auto-generated CRD YAML.
In `@config/crd/bases/metal.ironcore.dev_bmcversions.yaml`:
- Around line 125-132: The current CRD validation allows empty fallback values;
update the API marker validations on the BMC version Go type (add size(...) > 0
checks for both fallbackTransferProtocol and fallbackURI) so the generated
x-kubernetes-validations require non-empty strings when those fields are present
(e.g., change the second rule to check size(self.fallbackURI) > 0 and the
pairing rule to include size(...) > 0 for both fields), then run make manifests
to regenerate the config/crd/bases/*.yaml rather than editing the generated YAML
directly.
In `@internal/controller/bmcversion_controller.go`:
- Around line 298-313: The code currently calls
handleUpgradeInProgressState(...) first which can issue the upgrade task and
only afterward calls resetBMC(..., ConditionResetIssued), allowing a pre-upgrade
reset (GracefulRestartBMC) to be scheduled after the upgrade start; move the
pre-upgrade reset ahead of issuance by invoking resetBMC(ctx, bmcVersion,
bmcObj, ConditionResetIssued) and ensuring it succeeds (returning errors/result
as before) before calling handleUpgradeInProgressState; alternatively, if you
need early protocol validation keep protocol checks separate from the actual
issuance inside handleUpgradeInProgressState and only call the function that
performs the SimpleUpdate after resetBMC succeeds.
In `@internal/controller/helper.go`:
- Around line 637-650: The function currently compares image.TransferProtocol
and image.FallbackTransferProtocol to supportedProtocols case-insensitively but
returns the user's original casing (image.TransferProtocol /
image.FallbackTransferProtocol); change the returns to return the matching
element from supportedProtocols (the local variable used in the loop) so the
canonical/advertised protocol casing is returned (e.g., return supported,
image.URI, nil and return supported, image.FallbackURI, nil); locate the logic
around image.TransferProtocol, image.FallbackTransferProtocol, and
supportedProtocols in the helper.go function to update both the primary and
fallback return paths.
- Around line 36-37: The literals RetryOfFailedResourceConditionIssued and
RetryOfFailedResourceReasonIssued are old pre-migration names causing mismatch
with the canonical condition/reason (used by BMCVersionReconciler and
handleRetryAnnotationPropagation()), so replace these literal constants with the
canonical constant names (use the project’s canonical
ConditionRetryOfFailedResourceIssued and ReasonRetryOfFailedResourceIssued
equivalents) wherever they are declared/used (including the occurrences around
lines 36 and the block at 382–389) so the reconciler and
handleRetryAnnotationPropagation() operate on the same canonical names.
---
Nitpick comments:
In `@internal/controller/biosversion_controller_test.go`:
- Around line 451-480: The test creates a BIOSVersion (BIOSVersionSpec ->
BIOSVersionTemplate -> ImageSpec) with TransferProtocol "HTTP" and
FallbackTransferProtocol "HTTPS" but both URI and FallbackURI point to the same
successful payload so the fallback isn't actually exercised; change the primary
URI (the ImageSpec.URI used with TransferProtocol "HTTP") to an intentionally
invalid/unreachable value that will fail the mock BMC upgrade path while keeping
ImageSpec.FallbackURI pointing to the valid upgradeServerBiosVersion; this
ensures the controller must attempt the primary, fail, then use the fallback and
the Eventually checks on BIOSVersion.Status.State (InProgress -> Completed)
truly validate fallback behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 80892058-5652-4fbc-bb55-529e4c46efe1
📒 Files selected for processing (20)
api/v1alpha1/biosversion_types.gobmc/bmc.gobmc/mockup.gobmc/oem_helpers.gobmc/redfish.gobmc/redfish_kube.gobmc/redfish_local.goconfig/crd/bases/metal.ironcore.dev_biosversions.yamlconfig/crd/bases/metal.ironcore.dev_biosversionsets.yamlconfig/crd/bases/metal.ironcore.dev_bmcversions.yamlconfig/crd/bases/metal.ironcore.dev_bmcversionsets.yamldocs/api-reference/api.mdinternal/controller/biossettingsset_controller_test.gointernal/controller/biosversion_controller.gointernal/controller/biosversion_controller_test.gointernal/controller/bmcsettings_controller_test.gointernal/controller/bmcsettingsset_controller_test.gointernal/controller/bmcversion_controller.gointernal/controller/bmcversion_controller_test.gointernal/controller/helper.go
💤 Files with no reviewable changes (1)
- internal/controller/bmcsettingsset_controller_test.go
36ea10d to
d6c2fca
Compare
Implements transfer protocol fallback mechanism for BMC and BIOS firmware updates when primary protocol is not supported by BMC. Changes: - Add TransferProtocol, FallbackTransferProtocol, FallbackURI to ImageSpec - Add XValidation rules for fallback field consistency - Implement GetSupportedTransferProtocols() in BMC interface - Add protocol validation with case-insensitive matching - Add resolveTransferProtocol() helper function - Integrate protocol resolution in BMC/BIOS controllers - Extend 13 tests with protocol fields + add 2 fallback tests The implementation automatically falls back to alternate protocol+URI when BMC reports primary protocol as unsupported. Signed-off-by: Konstantin Koslowski <konstantin.koslowski@sap.com>
Signed-off-by: Konstantin Koslowski <konstantin.koslowski@sap.com>
d6c2fca to
21c8907
Compare
- Remove duplicate RetryOfFailedResource* and ServerMaintenance* constants from helper.go; use canonical ones from conditions.go everywhere - Fix bmcversion_controller.go to use ConditionRetryOfFailedResourceIssued and ReasonRetryOfFailedResourceIssued (correct short-form values) - Tighten CEL XValidation rule for ImageSpec: require size() > 0 on both fallbackTransferProtocol and fallbackURI, not just has() - Fix resolveTransferProtocol to return the canonical casing from the BMC's AllowableValues list instead of the user-supplied casing - Revert bmcversion_controller.go upgrade state order: resetBMC must run before handleUpgradeInProgressState to avoid nil UpgradeTask dereference on next reconcile Signed-off-by: Konstantin Koslowski <konstantin.koslowski@sap.com>
21c8907 to
add18cd
Compare
|
do we happen to know the reason behind this change? Can not find issue or a description. |
|
@coderabbitai what do you think is most k8 native way to solve the problem this PR is addressing? ( lets say 90% of the server supported all protocol, 10% support subset of it. Hence the need of fallback implemented here) Does it make sense to implement it this way? |
|
Tip For best results, initiate chat on the files or code changes.
Approach 1 — Current PR: Fallback fields in
|
Fallback transfer protocol for firmware updates
Some BMC implementations only support a subset of transfer protocols for the Redfish
SimpleUpdateaction. When a firmware image URI is configured with a protocol the BMC doesn't support, the update would silently fail or error with no clear recovery path.This PR adds a fallback mechanism to
ImageSpec: alongside the existingtransferProtocolandurifields, operators can now configurefallbackTransferProtocolandfallbackURI. Before issuing a firmware update, the controller queries the BMC'sUpdateServicefor itsTransferProtocol@Redfish.AllowableValueslist and selects which protocol to use. The BMC then fetches the firmware image directly from the resolved URI.Protocol resolution:
This applies to both
BMCVersionandBIOSVersioncontrollers. CEL validation rules enforce thatfallbackTransferProtocolandfallbackURIare always set together and non-empty, and that a fallback requires a non-empty primary protocol to be set.Example
The fallback fields are optional — if omitted, the controller behaves exactly as before.