fix(maxprocs): gracefully handle cgroup permission errors#7659

Closed
SAY-5 wants to merge 2 commits into kedacore:main from SAY-5:fix-maxprocs-permission-7653

@SAY-5 SAY-5 commented Apr 17, 2026

What does this PR do?

Fixes #7653.

When KEDA pods run in an environment where the cgroup CPU files are not readable (for example, under a restricted SecurityContext that denies access to /sys/fs/cgroup/cpu.max, or on certain managed Kubernetes offerings that mount cgroups read-only), go.uber.org/automaxprocs's maxprocs.Set returns a permission error:

failed to set max procs
 {"error": "open /sys/fs/cgroup/cpu.max: permission denied"}

Today all three callers — cmd/operator/main.go, cmd/webhooks/main.go, and cmd/adapter/main.go — treat that error as fatal and call os.Exit(1). The result is a CrashLoopBackOff with no way to bring KEDA up at all, even though the only consequence of the failure is that GOMAXPROCS cannot be derived from cgroup quota. Go's runtime default (runtime.NumCPU()) is already a safe fallback.
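For reference, the value the process keeps when no override is applied can be inspected directly. This is a minimal sketch and not part of the PR: `currentProcs` is a hypothetical helper name, and the two numbers it reports can differ depending on the Go version in use and on whether the `GOMAXPROCS` environment variable is set.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentProcs is a hypothetical helper for illustration only.
// runtime.GOMAXPROCS(0) reports the current setting without changing it.
func currentProcs() (gomaxprocs, numCPU int) {
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	p, n := currentProcs()
	fmt.Printf("GOMAXPROCS=%d NumCPU=%d\n", p, n)
}
```

Running this inside an affected pod would show what KEDA ends up with when maxprocs.Set cannot read the cgroup files.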

Fix

pkg/util/maxprocs.go now:

  1. Inspects the error returned by maxprocs.Set.
  2. If it is an fs.ErrPermission, logs a warning via the same klog.Logger (so it is still visible in pod logs) and returns nil.
  3. Otherwise propagates the error unchanged, preserving today's behavior for genuinely unexpected failures.

This lets KEDA start in locked-down environments while still surfacing the misconfiguration in logs. Non-permission errors (malformed cgroup data, I/O errors, etc.) remain fatal as before.

Which issue(s) this PR fixes

Fixes #7653

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO)
  • A changelog entry was added under Unreleased → Fixes
  • Change is trivial enough to not require unit tests (single call-site, behavior change is "convert permission error → warning")

SAY-5 added 2 commits April 16, 2026 23:08
When KEDA pods run in environments where cgroup files are not readable
(for example, a restricted SecurityContext or a non-standard cgroup
mount), go.uber.org/automaxprocs returns a permission error from
maxprocs.Set. Today all three callers (operator, webhooks, adapter)
treat that as fatal and os.Exit(1), producing a CrashLoopBackOff with
no way to start KEDA at all.

This change handles fs.ErrPermission in ConfigureMaxProcs by logging a
warning and returning nil. GOMAXPROCS is already left at the Go
runtime default (NumCPU) when maxprocs.Set fails, so the process can
continue to start and serve traffic. Non-permission errors are still
propagated unchanged.

Fixes #7653

Signed-off-by: SAY-5 <SAY-5@users.noreply.github.com>
Signed-off-by: SAY-5 <SAY-5@users.noreply.github.com>
@SAY-5 SAY-5 requested a review from a team as a code owner April 17, 2026 06:09
@github-actions

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested a review from a team April 17, 2026 06:10
@snyk-io

snyk-io Bot commented Apr 17, 2026

Snyk checks have passed. No issues have been found so far.

Status  Scan Engine           Critical  High  Medium  Low  Total (0)
        Open Source Security  0         0     0       0    0 issues

@rickbrouwer
Member

duplicate (#7655)

@rickbrouwer rickbrouwer added the duplicate This issue or pull request already exists label Apr 17, 2026

Development

Successfully merging this pull request may close these issues.

Failed to set max procs - keda failing to startup
