[WIP][FG:InPlacePodVerticalScaling] Consider exist CPU in topology manager when pod resize#10
Open
Chunxia202410 wants to merge 9 commits intoesotsal:policy_staticfrom
Conversation
Use the new topology.Allocation struct (a CPU set plus alignment metadata) instead of a CPU set, due to rebase. Remove the duplicate, unnecessary SetDefaultCPUSet call as per review comment.
- Revert introduction of API env mustKeepCPUs
- Replace mustKeepCPUs with local checkpoint "promised"
- Introduce "promised" in CPUManagerCheckpointV3 format
- Add logic, refactor with Beta candidate
- Fix lint issues
- Fail if mustKeepCPUs are not a subset of the resulting CPUs
- Fail if reusableCPUsForResize, mustKeepCPUs are not a subset
of aligned CPUs
- Fail if mustKeepCPUs are not a subset of reusable CPUs
- TODO improve align resize tests, go through testing, corner cases
refactor using cpumanager_test.go
- TODO improve CPUManagerCheckpointV3 tests
- TODO address code review/feedback to try different approach to allocate
stepwise instead of once off when resizing
- TODO check init-containers
- TODO check migration from v2 to v3 CPU Manager checkpoint
- TODO check kubectl failure when the resize is prohibited; can this be done earlier?
- WIP update CPU Manager tests to use refactored cpu_manager_test
- TODO update topologymanager,cpumanager,memorymanager documentation
# Conflicts: # pkg/kubelet/cm/cpumanager/policy_static.go
Is this the new KEP? It looks very interesting, and I have some questions after reading some code:
What type of PR is this?
/kind bug
/kind api-change
What this PR does / why we need it:
When a Guaranteed QoS class pod scales up or down with the static CPU management policy alongside InPlacePodVerticalScaling, the additional CPUs should be allocated near the existing CPUs.
This proposal makes the topology manager consider the existing (allocated) CPUs, meaning that before allocating additional CPUs we can select the preferred NUMA nodes near them.
There are two scopes: container scope and pod scope.
For container scope:
When a container scales up:
If the number of allocated CPUs plus reusable CPUs is greater than the container's requested CPU count, the reusable CPUs are enough to cover the additional CPUs, so topology hint generation is based on the reusable CPUs and the allocated (existing) CPUs.

If the number of allocated CPUs plus reusable CPUs is less than the requested CPU count, all of the reusable CPUs will be allocated to this container and the remaining CPUs will be allocated from the available CPUs, so topology hint generation is based on the available CPUs and the allocated CPUs plus the reusable CPUs.

Note: the reusable CPUs will be allocated first to the resized container in kubernetes#131966.
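The scale-up rule above can be sketched as a small decision function. This is a minimal illustration, not the actual kubelet API: the names (selectHintPool, poolReusable, poolAvailable) are hypothetical, and the comparison follows the text's wording that allocated + reusable CPUs must exceed the request for the reusable set to suffice.

```go
package main

import "fmt"

// hintPool names which CPU sets topology hint generation would draw
// from; both names are illustrative, not real kubelet identifiers.
type hintPool string

const (
	poolReusable  hintPool = "reusable+allocated"
	poolAvailable hintPool = "available+allocated+reusable"
)

// selectHintPool sketches the container-scope scale-up rule: if the
// already-allocated CPUs plus the reusable CPUs cover the new request,
// hints come from the reusable set; otherwise the remainder must come
// from the available CPUs.
func selectHintPool(allocated, reusable, requested int) hintPool {
	if allocated+reusable > requested {
		return poolReusable
	}
	return poolAvailable
}

func main() {
	// Scale up 8 -> 10 with 4 reusable CPUs: the reusable set suffices.
	fmt.Println(selectHintPool(8, 4, 10))
	// Scale up 8 -> 14 with 4 reusable CPUs: must draw from available CPUs.
	fmt.Println(selectHintPool(8, 4, 14))
}
```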
For pod scope:
Many containers may scale up and scale down at the same time, and the order of scale-ups and scale-downs is uncertain, so topology hint generation is based on the available CPUs and the pod's assigned CPUs plus the reusable CPUs, whether the containers scale up or down.
This has a similar purpose to #3, but the difference is that the earlier approach was handled in the CPU manager, while this proposal is simpler, as discussed in kubernetes#129719 (comment).
This implementation is based on the code of kubernetes#129719 and kubernetes#131966.
Further Thinking

Should we consider the cases below?
The reserved CPUs are {0,10}; the available CPUs are {9,19}.
There are 2 containers in 1 pod, and each container has 8 CPUs.
When container 0 scales up from 8 → 14 and container 1 scales down from 8 → 2 at the same time, the resize will fail because container 0 scales up first and there are not enough CPUs, so container 1 will never be resized.
The first proposal is: for every resource (CPU and memory), handle resource scale-downs first and scale-ups later.
The detailed design:
In the Admit function of the None scope / Container scope / Pod scope, the previous code has only one loop to process all containers, in the order defined in the YAML file.
Add another container loop.
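The two-loop ordering described for the first proposal can be sketched as follows. This is a hedged illustration only: the resize struct and admitOrder function are hypothetical, standing in for the real Admit loops over pod containers.

```go
package main

import "fmt"

// resize is a hypothetical stand-in for a container's CPU resize request.
type resize struct {
	name    string
	current int // currently allocated CPUs
	desired int // requested CPUs after resize
}

// admitOrder sketches the first proposal: one loop handles all
// scale-downs first (releasing CPUs), a second loop handles
// scale-ups, so freed CPUs are visible before any scale-up runs.
func admitOrder(resizes []resize) []string {
	var order []string
	for _, r := range resizes {
		if r.desired < r.current { // pass 1: scale-downs free CPUs
			order = append(order, r.name)
		}
	}
	for _, r := range resizes {
		if r.desired >= r.current { // pass 2: scale-ups (and no-ops)
			order = append(order, r.name)
		}
	}
	return order
}

func main() {
	rs := []resize{
		{"container0", 8, 14}, // scale up
		{"container1", 8, 2},  // scale down
	}
	fmt.Println(admitOrder(rs)) // prints [container1 container0]
}
```

With this ordering, the 8 → 2 scale-down in the example above releases its CPUs before the 8 → 14 scale-up is admitted, so the scale-up no longer fails for lack of free CPUs.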
[Update]
The second proposal is: if one container fails to allocate resources, continue allocating resources to the other containers.
==> With the second proposal, this case proceeds as follows: in the first resize round, container 0 fails and container 1 succeeds; in the second resize round, container 0 succeeds.
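The two-round behavior of the second proposal can be simulated with a toy allocator. Everything here is hypothetical (resizeRound, the free-CPU counter, the fixed processing order); it only demonstrates that continuing past a failed container lets a later round succeed.

```go
package main

import "fmt"

// resizeRound sketches the second proposal: try each container in
// order; on failure, record it and keep going instead of aborting,
// so a scale-down later in the same round can free CPUs that a
// retry round can then hand to the failed container.
func resizeRound(free int, want, have map[string]int) (int, []string) {
	var failed []string
	for _, name := range []string{"container0", "container1"} {
		delta := want[name] - have[name]
		if delta <= free {
			free -= delta // a negative delta (scale-down) adds to free
			have[name] = want[name]
		} else {
			failed = append(failed, name) // continue; retry next round
		}
	}
	return free, failed
}

func main() {
	have := map[string]int{"container0": 8, "container1": 8}
	want := map[string]int{"container0": 14, "container1": 2}

	// Round 1: container0 needs +6 but nothing is free, so it fails;
	// container1's scale-down still runs and frees 6 CPUs.
	free, failed := resizeRound(0, want, have)
	fmt.Println(free, failed)

	// Round 2: container0 retries with 6 free CPUs and succeeds.
	free, failed = resizeRound(free, want, have)
	fmt.Println(free, failed)
}
```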
If this case can be solved, the Pod scope topology hints step can be