
test: Accurate scaling strategy e2e test and functional tests for replica count defaults#7607

Open
jansworld wants to merge 10 commits into kedacore:main from jansworld:new-scaledjob-tests

Conversation


@jansworld jansworld commented Apr 6, 2026

Added a test for the accurate scaling strategy (within its own sub-directory so as to not modify the package name on the eager scaling strategy test). This test verifies both cases of the accurate scaling strategy. That is, when (maxScale + runningJobCount) <= maxReplicaCount, the number of new jobs created is maxScale - pendingJobCount (case 1). When (maxScale + runningJobCount) > maxReplicaCount, the number of new jobs created is maxReplicaCount - runningJobCount (case 2). We use a maxReplicaCount of 10.
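
For reference, the two cases reduce to roughly the following decision logic (a sketch of the behavior described above, not a verbatim copy of KEDA's executor code):

```go
// Sketch of the two cases described above; names follow the PR
// description, not necessarily KEDA's executor source.
func accurateScale(maxScale, runningJobCount, pendingJobCount, maxReplicaCount int64) int64 {
	if maxScale+runningJobCount > maxReplicaCount {
		// Case 2: clamp so the total of running jobs never exceeds maxReplicaCount.
		return maxReplicaCount - runningJobCount
	}
	// Case 1: room for everything; subtract jobs still pending.
	return maxScale - pendingJobCount
}
```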

It's worth noting that since we want jobs that run long enough to stick around while case 2 is exercised, the decision was made to simply use sleeper pods instead of an actual message processor. To simulate message consumption, the queue is cleared, as this mimics the behavior of Azure Storage Queue's length property when a processor is consuming a message (locked messages are not reported in the queue length). If this is a problem, I can put some more time in and actually create a message processor that behaves similarly to the sleeper pod.

Also note that these tests are performed with the pending job count as effectively 0.

Case 1: Send 4 messages into the queue. Wait for 4 jobs to be running. Clear the queue to simulate message consumption. Wait for all jobs to succeed.

Case 2: Send 4 messages into the queue. Wait for 4 jobs to be running. Clear the queue to simulate message consumption. Send 8 more messages into the queue. Assert that running jobs is clamped by maxReplicaCount (put differently, wait for 10 jobs to be running). This verifies the accurate strategy. Clean up by clearing the queue & waiting for all jobs to succeed.
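
Sketched as test code, case 2 looks roughly like this (enqueueMessages and WaitForRunningJobCount appear in this PR's diff; clearQueue is an illustrative stand-in for the queue-clearing step):

```go
// Case 2 flow, using helpers that appear in this PR's diff; clearQueue
// is an illustrative stand-in for the queue-clearing step.
enqueueMessages(ctx, t, client, 4)
assert.True(t, WaitForRunningJobCount(t, kc, scaledJobName, testNamespace, 4, iterationCount, 1))

clearQueue(ctx, t, client) // simulate consumption: queue length drops to 0

enqueueMessages(ctx, t, client, 8)
// maxScale (8) + runningJobCount (4) > maxReplicaCount (10), so only
// maxReplicaCount - runningJobCount = 6 new jobs should be created.
assert.True(t, WaitForRunningJobCount(t, kc, scaledJobName, testNamespace, 10, iterationCount, 1))
```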

Also added 3 functional tests confirming minReplicaCount and maxReplicaCount behavior in ScaledJobs. The first 2 tests verify default behavior, checking that the 2 fields evaluate to nil when omitted in the ScaledJob spec. The third test verifies that when minReplicaCount > maxReplicaCount, minReplicaCount is set to maxReplicaCount.
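
In outline, the three checks amount to something like this (a hypothetical test body; KEDA's ScaledJob spec fields for the replica counts are pointers, so omitting them leaves them nil):

```go
// Hypothetical shape of the three checks; field and method names follow
// KEDA's v1alpha1 ScaledJob API as described in this PR.
var sj kedav1alpha1.ScaledJob // spec omits both replica-count fields

// Defaults: omitted pointer fields evaluate to nil.
assert.Nil(t, sj.Spec.MinReplicaCount)
assert.Nil(t, sj.Spec.MaxReplicaCount)

// min > max: the effective minimum is clamped to the maximum.
minVal, maxVal := int32(12), int32(10)
sj.Spec.MinReplicaCount, sj.Spec.MaxReplicaCount = &minVal, &maxVal
assert.EqualValues(t, 10, sj.MinReplicaCount())
```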

Potential additional e2e tests: default & custom strategies

Checklist

  • When introducing a new scaler, I agree with the scaling governance policy
  • I have verified that my change is according to the deprecations & breaking changes policy
  • Tests have been added (if applicable)
  • Ensure make generate-scalers-schema has been run to update any outdated generated files
  • Changelog has been updated and is aligned with our changelog requirements, only when the change impacts end users
  • A PR is opened to update our Helm chart (repo) (if applicable, i.e. when deployment manifests are modified)
  • A PR is opened to update the documentation on (repo) (if applicable)
  • Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #3661

Relates to #

@jansworld jansworld requested a review from a team as a code owner April 6, 2026 03:03

snyk-io Bot commented Apr 6, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Passed | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.


github-actions Bot commented Apr 6, 2026

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • Verify that GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested a review from a team April 6, 2026 03:03

@rickbrouwer rickbrouwer left a comment


Thanks for adding this test! Happy with it 🙂 My first feedback.


```go
// Queue up 8 more messages to trigger the cap condition
enqueueMessages(ctx, t, client, 8)
assert.True(t, WaitForRunningJobCount(t, kc, scaledJobName, testNamespace, 10, iterationCount, 1),
```
rickbrouwer (Member):

I think there's a potential race condition in the cap condition test. After WaitForRunningJobCount returns, the pods may still be in Pending state, which affects KEDA's scaling decision when you put the 8 messages. What do you think, can we wait until the pods are actually Running?
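
The concern, sketched with standard client-go calls (the assertion values are illustrative):

```go
// Illustrative sketch of the suggested check: count pods that are
// actually Running rather than trusting the Job's Active status.
pods, err := kc.CoreV1().Pods(testNamespace).List(ctx, metav1.ListOptions{})
require.NoError(t, err)
running := 0
for _, pod := range pods.Items {
	if pod.Status.Phase == corev1.PodRunning {
		running++
	}
}
assert.Equal(t, 4, running) // only then enqueue the next 8 messages
```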

@jansworld jansworld (Author) commented Apr 6, 2026

@rickbrouwer Good point. I wasn't thinking about the underlying pod at the time. I should be able to fix this in the helper function. I'll probably rename the helper function to WaitForRunningPodCount instead of WaitForRunningJobCount due to the distinction you pointed out.

rickbrouwer (Member):

Yeah. I haven't looked properly myself yet (something about a lot of PRs to review lately), but could you check if WaitForAllPodRunningInNamespace might be used? Again, I'm not sure, but it's worth a check.

jansworld (Author):

I thought about using that, but felt it didn't give enough assurance that only a certain number of jobs were created. Alternatively, would WaitForScaledJobCount followed by WaitForAllPodRunningInNamespace be preferable to WaitForRunningPodCount?

rickbrouwer (Member):

I think WaitForAllPodRunningInNamespace will work at first glance. You can adjust it and then test it locally to check. If you still find that difficult to test locally, you can also adjust it and I can start an e2e test here.

jansworld (Author):

Sounds good. I can work that in after a WaitForScaledJobCount call to verify job count. I'll see about testing it locally as well. I'll just need to create a test Storage Account & Queue so the connection string actually goes somewhere. I should be able to get to that after work.

jansworld (Author):

@rickbrouwer I wasn't able to test things locally, but I didn't want to hold this up, so I pushed changes for the time being. If you could start the e2e test I would be very appreciative. I'm having some issues getting the ScaledJob to talk to Azurite at the moment when testing locally. This would probably work better with a real storage account, but I was trying to be frugal and not pay for one.

rickbrouwer (Member):

@jansworld started here: #7607 (comment)

Comment thread on tests/helper/helper.go (outdated)
@keda-automation keda-automation requested a review from a team April 6, 2026 15:26
@rickbrouwer rickbrouwer (Member) commented

One test is now nicely presented in a folder. What do you think, would it be nice to give the other one its own folder as well?

[screenshot: test folder layout]

Member

zroubalik commented Apr 8, 2026

/run-e2e internal
Update: You can check the progress here

Member

zroubalik commented Apr 9, 2026

/run-e2e internal
Update: You can check the progress here

@jansworld jansworld force-pushed the new-scaledjob-tests branch from 474993c to 9e8c90b on April 10, 2026 01:44
Added tests to confirm that minReplicaCount and maxReplicaCount are nil when not specified in the ScaledJob. Added a test to confirm that when minReplicaCount is greater than maxReplicaCount, minReplicaCount is set to maxReplicaCount

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
Added test to verify accurate scaling strategy behavior. The test covers 2 cases. The first case checks that when maxScale + runningJobs <= maxReplicaCount, jobs created = maxScale - pendingJobs. Pending jobs here is 0. The second case checks that when maxScale + runningJobs > maxReplicaCount, jobs created is clamped by maxReplicaCount (maxReplicaCount - runningJobCount).

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
Added an accurate scaling strategy subdirectory test to avoid package naming conflicts with the eager test without modifying that go file.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
Deleted the test after reading the code for MinReplicaCount() and realizing that nothing actually gets set. It simply returns maxReplicaCount in the case that the min is greater than the max (sketched below). Something like this belongs in pkg/scaling/executor/scaled_job_test.go if anything.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
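
Paraphrased, the accessor described in the commit above behaves roughly like this (a sketch of the described behavior, not the verbatim KEDA source):

```go
// Paraphrase of the described getter behavior; not the verbatim KEDA
// source. The 0 default for an unset minimum is an assumption here.
func (s *ScaledJob) MinReplicaCount() int64 {
	if s.Spec.MinReplicaCount == nil {
		return 0
	}
	if min := int64(*s.Spec.MinReplicaCount); min <= s.MaxReplicaCount() {
		return min
	}
	// min > max: nothing is mutated; the getter just returns the max.
	return s.MaxReplicaCount()
}
```
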
…changes

Renamed WaitForRunningJobCount to WaitForRunningPodCount to address the fact that despite a job showing as Running, its underlying pod could still be in a Pending state. So, we instead wait for the pods to be Running. The other change made was to have the helper return false if the number of pods in a Running state differs from the target, instead of if the count was greater than or equal to the target. This will help ensure test accuracy (see the sketch after this commit).

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
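
A minimal sketch of the renamed helper's loop, under the assumptions noted in its comments:

```go
// Simplified sketch of the renamed helper; the real signature in
// tests/helper/helper.go may differ. Note the exact-match comparison.
func WaitForRunningPodCount(t *testing.T, kc *kubernetes.Clientset, namespace string, target, iterations, intervalSeconds int) bool {
	for i := 0; i < iterations; i++ {
		pods, err := kc.CoreV1().Pods(namespace).List(context.Background(), metav1.ListOptions{})
		require.NoError(t, err)
		running := 0
		for _, pod := range pods.Items {
			if pod.Status.Phase == corev1.PodRunning {
				running++
			}
		}
		if running == target { // exact match, not >=, per the change above
			return true
		}
		time.Sleep(time.Duration(intervalSeconds) * time.Second)
	}
	return false
}
```
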
Fixed a mistake in the WaitForRunningPodCount helper function: pods were correctly targeted instead of Jobs, but the check for whether a pod was in a Running state used the wrong types.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
Removed WaitForRunningPodCount and opted to go with WaitForScaledJobCount followed by WaitForAllPodRunningInNamespace since that keeps the changes a bit more concise and works for what this test needs.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
…point of assertion for testing the accurate strategy

I hastily assumed that a combination of WaitForScaledJobCount and WaitForAllPodRunningInNamespace would satisfy what is needed for the test. However, that approach comes with a major flaw: unless you delete the jobs, those pods count towards the condition in WaitForAllPodRunningInNamespace. Thus, a much simpler way to test the strategy is to wait for the correct number of pods to be running. A running pod in the given namespace implies a corresponding Job exists, so we test the 2 strategy cases with pod count. This also passes local tests.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
@jansworld jansworld force-pushed the new-scaledjob-tests branch from 9e8c90b to 6283b40 on April 10, 2026 01:58

rickbrouwer commented Apr 10, 2026

/run-e2e internal
Update: You can check the progress here

Added WaitForScaledJobCount back into the checks since it allows for quicker deletion of the queue messages instead of having to wait for the pods to be running. Additionally, increased the job sleep time to 120 seconds in order to account for any slowness in dispatching jobs (illustrated below). In the e2e tests that were kicked off, it looked like some jobs finished before the queue was cleared, causing the running pod count to differ from the expected value. This issue looks to be fixed with the increased sleep time in the pod.

Issue kedacore#3661

Signed-off-by: jansworld <navon.josh@gmail.com>
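
In the job template, the change described above amounts to a longer sleep in the container command (an illustrative fragment; the real manifest lives in the test file):

```go
// Illustrative fragment of the sleeper job template (container image and
// structure are assumptions); the longer sleep keeps pods Running while
// the queue is cleared and refilled.
const sleeperContainerTemplate = `
        containers:
          - name: sleeper
            image: busybox
            command: ["sh", "-c", "sleep 120"]
`
```
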
@jansworld jansworld (Author) commented

@rickbrouwer are you able to kick off the e2e tests when you have a moment?


rickbrouwer commented Apr 17, 2026

/run-e2e internal
Update: You can check the progress here

@rickbrouwer rickbrouwer added the Awaiting/2nd-approval label (This PR needs one more approval review) on Apr 18, 2026

Labels

Awaiting/2nd-approval (This PR needs one more approval review)


Development

Successfully merging this pull request may close these issues.

Extend e2e and unit test coverage for ScaledJob

3 participants