RFC: Admission controller proof of concept #27491
Draft
mismithhisler wants to merge 1 commit into main from
Description
DO NOT MERGE
As part of some investigative work around Nomad fair scheduling in resource-constrained clusters, we decided to look at what fair resource sharing would look like as an implementation of a generic admission controller. Our initial implementation of this admission controller pushed jobs to a queue and kept track of reserved CPU and memory at the namespace level (although we had plans for more configurable tenancy models). When resources for the configured node pool became constrained and jobs began to queue, a simple fair-sharing algorithm would decide which one to run next. (Using the SkipEvalCreation flag, a controller could mutate the job so that it was stored in Raft but not run until the controller forced an evaluation.)
Using a configuration like the below would configure the Nomad server to forward job
Fair-sharing algorithms are inherently stateful, which complicated the controller enough that it did not seem to provide much more benefit than submitting jobs directly to a queue (which is what most users with this use case do).
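To make the state involved more concrete, here is a minimal Go sketch of namespace-level bookkeeping with a queue. It is not the code in this PR. All type and function names (Resources, QueuedJob, FairQueue, and so on) are hypothetical, and the selection rule is an assumption: it picks the queued job whose namespace currently holds the smallest dominant share (the larger of its CPU and memory fractions), a heuristic borrowed from dominant-resource-style fairness rather than the algorithm we actually used.

```go
package main

// Resources is a hypothetical pair of CPU (MHz) and memory (MB) amounts.
type Resources struct {
	CPU      int
	MemoryMB int
}

// QueuedJob is a hypothetical record for a job waiting to be evaluated.
type QueuedJob struct {
	ID        string
	Namespace string
	Ask       Resources
}

// FairQueue holds the state a fair-sharing controller has to keep:
// the pool capacity, what each namespace has reserved, and the pending jobs.
type FairQueue struct {
	capacity Resources // assumed to be positive in both dimensions
	reserved map[string]Resources
	pending  []QueuedJob
}

func NewFairQueue(capacity Resources) *FairQueue {
	return &FairQueue{capacity: capacity, reserved: map[string]Resources{}}
}

// Enqueue appends a job to the pending list in arrival order.
func (q *FairQueue) Enqueue(j QueuedJob) {
	q.pending = append(q.pending, j)
}

// dominantShare returns the larger of the namespace's CPU and memory
// fractions of the pool capacity.
func (q *FairQueue) dominantShare(ns string) float64 {
	r := q.reserved[ns]
	cpu := float64(r.CPU) / float64(q.capacity.CPU)
	mem := float64(r.MemoryMB) / float64(q.capacity.MemoryMB)
	if cpu > mem {
		return cpu
	}
	return mem
}

// Next removes and returns the pending job that fits in the remaining
// capacity and whose namespace has the smallest dominant share.
// Ties go to the job that was queued first.
func (q *FairQueue) Next() (QueuedJob, bool) {
	var used Resources
	for _, r := range q.reserved {
		used.CPU += r.CPU
		used.MemoryMB += r.MemoryMB
	}
	best, bestShare := -1, 0.0
	for i, j := range q.pending {
		if used.CPU+j.Ask.CPU > q.capacity.CPU ||
			used.MemoryMB+j.Ask.MemoryMB > q.capacity.MemoryMB {
			continue // does not fit right now; stays queued
		}
		if share := q.dominantShare(j.Namespace); best == -1 || share < bestShare {
			best, bestShare = i, share
		}
	}
	if best == -1 {
		return QueuedJob{}, false
	}
	job := q.pending[best]
	q.pending = append(q.pending[:best], q.pending[best+1:]...)
	r := q.reserved[job.Namespace]
	r.CPU += job.Ask.CPU
	r.MemoryMB += job.Ask.MemoryMB
	q.reserved[job.Namespace] = r
	return job, true
}

// Release returns a finished job's resources to its namespace's budget.
func (q *FairQueue) Release(j QueuedJob) {
	r := q.reserved[j.Namespace]
	r.CPU -= j.Ask.CPU
	r.MemoryMB -= j.Ask.MemoryMB
	q.reserved[j.Namespace] = r
}
```

In a real controller, Next would be called whenever capacity frees up, and the returned job is the one for which an evaluation would be forced. Even this toy version needs to observe job completion (Release) and survive restarts without losing the reserved map, which is the kind of statefulness that made the controller hard to justify.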
We are currently investigating other places in Nomad where this type of logic might fit better, but wanted to push up these changes in the event others may have a good use case or ideas for external admission controllers.
Request for comments
Please share if you could use admission controllers and if you have feedback on this design. While we do not have admission controllers on the roadmap yet, we have long intended to make admission-controller-like patterns easier to implement.
Feedback welcome!