-
Notifications
You must be signed in to change notification settings - Fork 98
docs: Add missing plugin documentation for resource-strategy-fit, capacity,… #440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… deviceshare, extender, nodegroup, usage, and resourcequota plugins Signed-off-by: Shive <shiverse94@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @shiverse94, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Volcano scheduler documentation by introducing comprehensive guides for seven previously undocumented plugins. These additions aim to bridge the gap between existing scheduler functionalities and their official explanations, thereby improving user understanding and enabling more effective utilization of Volcano's advanced scheduling capabilities. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds much-needed documentation for seven scheduler plugins. The new documentation is well-structured and detailed, which will significantly help users understand and configure these plugins. I've provided a few suggestions to improve clarity and consistency in some of the new documents. Overall, this is a great contribution to improving Volcano's usability.
|
|
||
| #### Features | ||
|
|
||
| - **Queue Capacity Management**: Enforces queue capacity limits based on configured capability |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better clarity and consistency with other parts of the documentation that refer to configuration fields, consider formatting capability as code to indicate it's a specific field name.
| - **Queue Capacity Management**: Enforces queue capacity limits based on configured capability | |
| - **Queue Capacity Management**: Enforces queue capacity limits based on configured `capability` |
| - **extender.allocateFuncVerb** (string): Verb for AllocateFunc event handler | ||
| - **extender.deallocateFuncVerb** (string): Verb for DeallocateFunc event handler | ||
| - **extender.ignorable** (bool): Whether the extender can ignore unexpected errors | ||
| - **extender.managedResources** (list): List of resources managed by the extender (comma-separated or list format) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description for extender.managedResources says it supports a "comma-separated or list format". The YAML example shows a list. To avoid confusion, could you please clarify if a comma-separated string is also a valid value for this argument in the YAML configuration? If it is, providing an example for that format would be helpful.
|
|
||
| ##### Configuration Parameters | ||
|
|
||
| - **strict** (bool): Enable strict mode. In strict mode, nodes without node group labels are rejected if the queue has affinity rules, and nodes with node group labels are rejected if the queue has no affinity rules. Default is `true`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explanation for strict mode could be clearer. The phrase "nodes with node group labels are rejected if the queue has no affinity rules" is a bit counter-intuitive. It would be helpful to add more context or a rationale for this behavior, for example, by explaining that this ensures that general-purpose queues do not land on specialized node groups by default.
| - Preferred affinity rules are soft constraints and affect scoring | ||
| - Anti-affinity rules take precedence over affinity rules | ||
| - In hierarchical queues, child queues inherit affinity rules from their nearest ancestor with affinity rules | ||
| - When hierarchical queues are enabled, set `enableHierarchy: true` in the plugin configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This note about enableHierarchy: true might be misleading, as this setting is part of the capacity plugin, not the nodegroup plugin. To avoid confusion, I suggest clarifying that hierarchical queue support for node group affinity inheritance depends on the capacity plugin being configured with enableHierarchy: true.
| - When hierarchical queues are enabled, set `enableHierarchy: true` in the plugin configuration | |
| - In hierarchical queues, child queues inherit affinity rules. This feature requires the `capacity` plugin to be enabled with `enableHierarchy: true` in the scheduler configuration. |
| - **weight**: Weight for this resource in scoring calculation | ||
| - **sra** (optional): SRA configuration: | ||
| - **enable**: Enable/disable SRA | ||
| - **resources**: Comma-separated list of resources for SRA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description says "Comma-separated list of resources", but the configuration example on line 56 only shows a single resource (resources: nvidia.com/gpu). This could be confusing. To make it clearer for users how to specify multiple resources, please consider updating the example to show a comma-separated list, for example: resources: "nvidia.com/gpu,amd.com/gpu". This also applies to proportional.resources described on line 81.
| 4. **Scoring Formula**: The score is calculated as: | ||
| - CPU score: `(100 - cpuUsage) / 100 * cpuWeight` | ||
| - Memory score: `(100 - memoryUsage) / 100 * memoryWeight` | ||
| - Combined score: `(cpuScore + memoryScore) / (cpuWeight + memoryWeight) * usageWeight * MaxNodeScore` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - Threshold values are percentages (0-100) | ||
| - Weights determine the relative importance of different resources in scoring | ||
| - When predicate is disabled, the plugin only affects node scoring, not filtering | ||
| - The plugin uses average usage metrics over a configured period |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a slight inconsistency in the description of the metrics used. The "Overview" and "How It Works" sections refer to "current resource utilization", while this note mentions "average usage metrics over a configured period". To avoid confusion, could you clarify the relationship between these? For example, is the "current" utilization actually a recent average?
Signed-off-by: Shive <shiverse94@gmail.com>
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
/kind documentation
What this PR does / why we need it:
Adds documentation for 7 missing scheduler plugins that exist in the volcano-sh/volcano repository but were not documented on the website. This aligns the website documentation with available plugins.
Which issue(s) this PR fixes:
Docs(Plugins): Synchronize plugin documentation with the main volcano repository #427
Addresses missing plugin documentation to improve user understanding and usage of available Volcano scheduler plugins.
// @JesseStutler