
[Proposal] scheduler estimator reservation #7416

Merged
karmada-bot merged 1 commit into karmada-io:master from XiShanYongYe-Chang:scheduler-estimator-reservation
May 7, 2026

Conversation

@XiShanYongYe-Chang
Member

What type of PR is this?

/kind feature
/kind documentation

What this PR does / why we need it:

ref #6783

Which issue(s) this PR fixes:

Fixes #6783

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Copilot AI review requested due to automatic review settings April 18, 2026 09:47
@karmada-bot karmada-bot added kind/feature Categorizes issue or PR as related to a new feature. kind/documentation Categorizes issue or PR as related to documentation. labels Apr 18, 2026
@karmada-bot karmada-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 18, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the resource over-commitment issue in the Karmada scheduler caused by the architectural latency between scheduling decisions and the actual reflection of resource consumption in the estimator's snapshot. By implementing a reservation mechanism that tracks in-flight scheduling decisions and deducts them from available cluster capacity, the system ensures more accurate resource estimation and prevents scheduling conflicts in high-throughput scenarios.

Highlights

  • Scheduler-side Reservation Cache: Introduced a reservation cache in the karmada-scheduler to track submitted-but-not-yet-landed resource commitments, preventing over-commitment during rapid scheduling cycles.
  • Estimator-side Reservation Deduction: Extended the estimator gRPC interface to propagate reserved workload information, allowing the estimator to deduct these resources using a First Fit (FF) algorithm before calculating available capacity.
  • Reservation Lifecycle Management: Implemented a robust reservation release mechanism based on Work object health status, with a TTL-based garbage collection fallback to ensure stale reservations are automatically cleaned up.
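
As a rough illustration of the estimator-side deduction described in the highlights above, the following Go sketch deducts reserved replicas node by node with a First Fit pass and then counts how many new replicas still fit. The `ReservedWorkload` name comes from the review discussion, but its fields here, along with `Node`, `DeductFirstFit`, and `MaxReplicas`, are assumptions made for this example and are not taken from the proposal or from Karmada's code.

```go
package main

import "fmt"

// Node is a simplified, hypothetical view of one node's free resources.
type Node struct {
	Name    string
	FreeCPU int64 // millicores
	FreeMem int64 // MiB
}

// ReservedWorkload is an assumed shape for an in-flight commitment:
// replicas already assigned to this cluster but not yet visible in the snapshot.
type ReservedWorkload struct {
	Namespace     string
	Name          string
	Replicas      int
	CPUPerReplica int64
	MemPerReplica int64
}

// DeductFirstFit places every reserved replica on the first node that can hold it
// and subtracts its request from that node. It works on a copy of nodes and
// returns the adjusted copy plus the number of replicas that fit nowhere.
func DeductFirstFit(nodes []Node, reserved []ReservedWorkload) ([]Node, int) {
	out := append([]Node(nil), nodes...)
	unplaced := 0
	for _, w := range reserved {
		for r := 0; r < w.Replicas; r++ {
			placed := false
			for i := range out {
				if out[i].FreeCPU >= w.CPUPerReplica && out[i].FreeMem >= w.MemPerReplica {
					out[i].FreeCPU -= w.CPUPerReplica
					out[i].FreeMem -= w.MemPerReplica
					placed = true
					break
				}
			}
			if !placed {
				unplaced++
			}
		}
	}
	return out, unplaced
}

// MaxReplicas sums, over all nodes, how many replicas of the given size fit.
func MaxReplicas(nodes []Node, cpu, mem int64) int64 {
	if cpu <= 0 || mem <= 0 {
		return 0
	}
	var total int64
	for _, n := range nodes {
		byCPU, byMem := n.FreeCPU/cpu, n.FreeMem/mem
		if byMem < byCPU {
			byCPU = byMem
		}
		total += byCPU
	}
	return total
}

func main() {
	nodes := []Node{{"node-a", 4000, 8192}, {"node-b", 2000, 4096}}
	reserved := []ReservedWorkload{{"default", "web", 3, 1000, 2048}}
	after, unplaced := DeductFirstFit(nodes, reserved)
	// Capacity before deduction, capacity after deduction, unplaced reservations.
	fmt.Println(MaxReplicas(nodes, 1000, 2048), MaxReplicas(after, 1000, 2048), unplaced)
}
```

With these sample numbers the program prints `6 3 0`: the three reserved replicas all land on `node-a`, so only three further replicas fit instead of six.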

@XiShanYongYe-Chang
Member Author


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a proposal for multi-cluster workload scheduling with resource reservation to mitigate resource over-commitment during rapid scheduling. It details a scheduler-side reservation cache and estimator-side deduction using a First Fit algorithm. The review feedback points out a missing "Namespace" field in the "ReservedWorkload" structure required for namespaced resource quota checks and highlights an inconsistency in the documentation concerning the reservation release trigger.

Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Contributor

Copilot AI left a comment

Pull request overview

Adds a new scheduling proposal document describing a “reservation” mechanism to avoid stale resource snapshots in karmada-scheduler-estimator during rapid consecutive aggregated scheduling decisions (ref #6783).

Changes:

  • Introduces a detailed design proposal for scheduler-side reservation caching and estimator-side deduction via extended gRPC requests.
  • Documents reservation lifecycle, release strategy options, risks/mitigations, and a test plan.

Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
@codecov-commenter

codecov-commenter commented Apr 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 41.92%. Comparing base (e669d0d) to head (fc8341f).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7416   +/-   ##
=======================================
  Coverage   41.92%   41.92%           
=======================================
  Files         879      879           
  Lines       54328    54328           
=======================================
+ Hits        22778    22779    +1     
+ Misses      29828    29827    -1     
  Partials     1722     1722           
Flag Coverage Δ
unittests 41.92% <ø> (+<0.01%) ⬆️

@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from 1ff5134 to c8b8d4b Compare April 18, 2026 12:36
Member

@mszacillo mszacillo left a comment

Just looked over this proposal and left a few thoughts. Thank you for writing this up, it is incredibly thorough!

Out of curiosity, do we think it's worth adding a note on how this feature will work with the node autoscaler support for the estimator being discussed in #7375? The implementation is probably not too different from the other estimators discussed in this proposal, but it is perhaps worth mentioning.

Comment thread docs/proposals/scheduling/estimator-reservation/README.md
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
@XiShanYongYe-Chang
Member Author

Out of curiosity, do we think it's worth adding a note on how this feature will work with the node autoscaler support for the estimator being discussed in #7375? The implementation is probably not too different from the other estimators discussed in this proposal, but it is perhaps worth mentioning.

Thanks @mszacillo, let me do it.

Comment thread docs/proposals/scheduling/estimator-reservation/README.md
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from c8b8d4b to 99e7bfc Compare April 22, 2026 02:22
@XiShanYongYe-Chang
Member Author

/retest

@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from 99e7bfc to 08b05a6 Compare April 23, 2026 08:00
@karmada-bot karmada-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 23, 2026
@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from 08b05a6 to d0a5f03 Compare April 23, 2026 08:29
@XiShanYongYe-Chang
Member Author

/retest

@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from d0a5f03 to 05b1b49 Compare April 24, 2026 03:29
Member

@RainbowMango RainbowMango left a comment

Generally, it looks good to me.
To briefly summarize what we need to do with this proposal (to ensure we are all on the same page):

  • Maintain a list of pending ResourceBindings in the karmada-scheduler cache
    • Append ResourceBinding to the list after its scheduling is completed.
    • Remove the ResourceBinding from the list when its status becomes healthy.
    • Set a TTL to ensure that the ResourceBinding won't stay in the list forever.
  • When evaluating available replicas, the karmada-scheduler passes this list to the estimator.
  • When the estimator evaluates resources, it first deducts the resources required by these pending ResourceBindings.

PS: Although several further optimization plans are listed in the secondary-verification-en.md file, none of them are ideal. So, this attempt will be excluded from this proposal. (A separate proposal is needed when there is a need.)

@mszacillo Michas, do you have any further comments or concerns?

Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/secondary-verification-en.md Outdated
Comment thread docs/proposals/scheduling/estimator-reservation/README.md Outdated
@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from 05b1b49 to 0c6dfe8 Compare April 29, 2026 08:22
@karmada-bot karmada-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 29, 2026
@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch 2 times, most recently from fb71dca to 378afb4 Compare April 30, 2026 06:06
@XiShanYongYe-Chang
Member Author

All comments have been addressed.

Comment thread docs/proposals/scheduling/estimator-reservation/README.md
@mszacillo
Member

Did one more pass over the proposal, overall it looks good! Thank you for addressing all the comments @XiShanYongYe-Chang. I left a single nit comment, and will review the secondary verification proposal when ready.

Let me know when you create the umbrella item for this proposal! I'd be happy to contribute.

Signed-off-by: changzhen <changzhen5@huawei.com>
@XiShanYongYe-Chang XiShanYongYe-Chang force-pushed the scheduler-estimator-reservation branch from 8a286f1 to fc8341f Compare May 7, 2026 08:48
Member

@RainbowMango RainbowMango left a comment

/lgtm
/approve

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label May 7, 2026
@karmada-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 7, 2026
@RainbowMango RainbowMango added this to the v1.18 milestone May 7, 2026
@karmada-bot karmada-bot merged commit 03b39dd into karmada-io:master May 7, 2026
17 of 18 checks passed
@XiShanYongYe-Chang
Member Author

Did one more pass over the proposal, overall it looks good! Thank you for addressing all the comments @XiShanYongYe-Chang. I left a single nit comment, and will review the secondary verification proposal when ready.

Let me know when you create the umbrella item for this proposal! I'd be happy to contribute.

Hi @mszacillo, I created an umbrella issue to track the work. Feel free to claim any tasks you are interested in, and please point out any tasks that are missing. Thank you. :)

@XiShanYongYe-Chang XiShanYongYe-Chang deleted the scheduler-estimator-reservation branch May 7, 2026 12:01

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stale Resource Assessment in karmada-scheduler-estimator Causes Incorrect Placement During Concurrent Aggregated Scheduling

7 participants