libct: fix resetting CPU affinity#5025
Conversation
|
how about @askervin |
NumCPU() returns the number of CPUs usable by the current process. The purpose of the Besides, I would avoid adding any magic values (like 1024) to this logic. If, for instance, unix.CPUSet would be changed to [64]uint64 instead of current [16]uint64, the current workaround calling SchedGetaffinity() would start working 4096-CPU systems, or if unix.SchedGetaffinity/SchedSetaffinity would be updated to work with dynamic (large enough) cpuset sizes, then this fix would work as is. With magic numbers we would have introduced only a new place that needs to be fixed at some point. |
|
I'm not sure this completely fixes #5023 -- yes, it stops the reset issue but still leaves you with the same problem that #4858 was trying to solve. If you are explicitly requesting CPUs >= 1024 then you will still not be able to get them AFAICS because we still use I think we should just be calling For what it's worth, I think even the current behaviour of resetting to use the first 1024 CPUs by default is better than regressing #4858. |
c92d043 to
be8dbc6
Compare
|
@cyphar, I think you're right: why making a quick fix when the proper fix is not really that much harder. Updated. Playing as safe as the latest go runtime in the size of the CPU mask. |
be8dbc6 to
8618a16
Compare
kolyshkin
left a comment
There was a problem hiding this comment.
a single nit, otherwise LGTM
8618a16 to
016fac8
Compare
|
Hmm, should we try to fix unix.CPUSet instead? |
I'm afraid trying to make unix.CPUSet dynamic will break lots of code. At least I can't come up with any nice change that would make it just work without modifications in applications. Yet the code that uses it is already broken in 1024+ CPU systems, at least trivial changes that would make it dynamic, can easily break it in smaller systems. One possibility could be adding unix.CPUSetS, following the spirit in the C library and macros We could perhaps practice it in runc internal or libcontainer, and then propose to unix.CPUSetS when we are happy and runc works fine? How this would sound to you? A prototype of CPUSetS is in the topmost commit (WIP: dynamic cpuset) in this branch: |
|
FWIW I opened https://go.dev/cl/727540 and https://go.dev/cl/727541 and later replaced these two with https://go-review.googlesource.com/c/sys/+/735380 which is currently in review. If/when approved it can be reused to solve this. |
This is great, @kolyshkin, thanks! |
kolyshkin
left a comment
There was a problem hiding this comment.
Given the fact that https://go-review.googlesource.com/c/sys/+/735380 is stuck in review and we want 1.4.1 out, I'm going to give this one a go (plus, most probably, we'll still need retryOnEINTR wrapper anyway).
Just a nit to SchedSetaffinity
016fac8 to
9a0a6cc
Compare
unix.CPUSet is limited to 1024 CPUs. Calling unix.SchedSetaffinity(pid, cpuset) removes all CPUs starting from 1024 from allowed CPUs of pid, even if cpuset is all ones. As a consequence, when runc tries to reset CPU affinity to "allow all" by default, it prevents all containers from CPUs 1024 onwards. This change uses a huge CPU mask to play safe and get all possible CPUs enabled with a single sched_setaffinity call. Fixes: opencontainers#5023 Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
9a0a6cc to
3307324
Compare
unix.CPUSet is limited to 1024 CPUs. Calling
unix.SchedSetaffinity(pid, cpuset) removes all CPUs starting from 1024 from allowed CPUs of pid, even if cpuset is all ones. The consequence of runc trying to reset CPU affinity by default is that it prevents all containers from using those CPUs.
This change is a quick fix that brings runc behavior back to what it was in v1.3.0 in 1024+ CPU systems. Real fix requires calling sched_setaffinity with cpusetsize fitting all CPUs in the system, which cannot be done with current unix.SchedSetaffinity.
Fixes: #5023