Update CPU topology discovery to support ARM #106

Merged
k8s-ci-robot merged 6 commits into kubernetes-sigs:main from pravk03:arm-support on Apr 13, 2026
Conversation

@pravk03
Contributor

@pravk03 pravk03 commented Apr 1, 2026

  • Refactor the CPU discovery mechanism from parsing /proc/cpuinfo to a sysfs-based approach (/sys/devices/system/cpu)
  • Implement a fallback for L3 cache discovery when the cache id file is not present
  • Update SMT detection to correctly handle the notimplemented status
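
As a rough illustration of the SMT item in the list above, here is a minimal Go sketch of how the content of `/sys/devices/system/cpu/smt/control` could be classified. The function name `smtState` and the idea of returning a separate `known` flag are assumptions made for this example, not the code in this PR. The point it shows is that `notimplemented` gives no answer either way, so a caller would need another signal (for example per-CPU thread sibling lists).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// smtControlPath is the kernel's global SMT control file.
const smtControlPath = "/sys/devices/system/cpu/smt/control"

// smtState classifies the content of the smt/control file (hypothetical helper).
// enabled is true only for "on". known is false when the file gives no usable
// answer (for example "notimplemented"), so the caller should fall back to
// another signal such as thread sibling lists.
func smtState(control string) (enabled, known bool) {
	switch strings.TrimSpace(control) {
	case "on":
		return true, true
	case "off", "forceoff", "notsupported":
		return false, true
	default: // "notimplemented" or anything unexpected
		return false, false
	}
}

func main() {
	data, err := os.ReadFile(smtControlPath)
	if err != nil {
		fmt.Println("smt/control not readable:", err)
		return
	}
	enabled, known := smtState(string(data))
	fmt.Printf("enabled=%v known=%v\n", enabled, known)
}
```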

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 1, 2026
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 1, 2026
@pravk03 pravk03 force-pushed the arm-support branch 2 times, most recently from c0fa408 to 523caf5 Compare April 2, 2026 01:32
@pravk03
Contributor Author

pravk03 commented Apr 2, 2026

Verified this locally on the ARM machine I have access to, but we need to ensure the underlying logic covers the diverse range of ARM configurations and that we are not missing anything.

/cc @ffromani @catblade
/hold

@pravk03 pravk03 marked this pull request as ready for review April 2, 2026 01:37
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 2, 2026
@k8s-ci-robot k8s-ci-robot requested a review from ffromani April 2, 2026 01:37
@k8s-ci-robot
Contributor

@pravk03: GitHub didn't allow me to request PR reviews from the following users: catblade.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

Verified this locally on the ARM machine I had access to.

/cc @ffromani @catblade
/hold

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 2, 2026
@k8s-ci-robot k8s-ci-robot requested a review from pohly April 2, 2026 01:37
Contributor

@AutuSnow AutuSnow left a comment

@pravk03 I see that in the new code, if CoreID and SocketID are never assigned, the default value of -1 is retained. The old code checked for this and skipped the CPU when the value was less than 0. Should we keep this boundary check?

@pravk03 pravk03 mentioned this pull request Apr 2, 2026
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 2, 2026
@pravk03
Contributor Author

pravk03 commented Apr 2, 2026

Good point @AutuSnow. Updated to add the boundary check back. We currently just log errors and exclude the CPU from the ResourceSlice.
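
For readers following along, the boundary check described above could look roughly like this. It is a minimal sketch: the `CPUInfo` struct is a trimmed-down stand-in and `filterValid` is a hypothetical helper name, not the function in `pkg/cpuinfo`. It keeps only CPUs whose CoreID and SocketID were discovered and reports the rest so they can be logged and left out of the ResourceSlice.

```go
package main

import "fmt"

// CPUInfo is a hypothetical, trimmed-down record; -1 means "not discovered".
type CPUInfo struct {
	CPUID    int
	CoreID   int
	SocketID int
}

// filterValid keeps CPUs whose CoreID and SocketID were discovered and
// returns the IDs of the CPUs that were dropped so the caller can log them.
func filterValid(cpus []CPUInfo) (valid []CPUInfo, skipped []int) {
	for _, c := range cpus {
		if c.CoreID < 0 || c.SocketID < 0 {
			skipped = append(skipped, c.CPUID)
			continue
		}
		valid = append(valid, c)
	}
	return valid, skipped
}

func main() {
	cpus := []CPUInfo{
		{CPUID: 0, CoreID: 0, SocketID: 0},
		{CPUID: 1, CoreID: -1, SocketID: 0}, // core_id could not be read
	}
	valid, skipped := filterValid(cpus)
	fmt.Println("kept:", len(valid), "skipped CPU IDs:", skipped)
}
```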

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 3, 2026
Comment thread pkg/cpuinfo/cpuinfo.go Outdated
clusterID, _ := strconv.Atoi(strings.TrimSpace(clusterStr))
cpuInfo.ClusterID = clusterID
} else {
cpuInfo.ClusterID = 0 // Default to 0 if not present
Contributor

I remember that on x86 systems the ClusterID of all CPUs is set to 0. Will this lead to false sharing or incorrect core counts? Should we use a default value of -1 to indicate unknown?

Contributor Author

Yeah. -1 is a better default. Updated.
On x86, the cluster_id file is present but reports 65535 to indicate unknown. I am also setting that to -1.
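
The behavior described in this reply can be sketched as follows. This is an illustration under assumptions: `parseClusterID`, `unknownID`, and `clusterIDNone` are names invented for the example, and, unlike the snippet under review, the sketch also maps a parse error to -1 rather than ignoring it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	unknownID     = -1    // sentinel for "unknown / not applicable"
	clusterIDNone = 65535 // value reported on x86 when there is no cluster (per this thread)
)

// parseClusterID converts the raw content of a cluster_id file into an ID.
// present is false when the file does not exist on this system.
func parseClusterID(raw string, present bool) int {
	if !present {
		return unknownID
	}
	id, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil || id < 0 || id == clusterIDNone {
		return unknownID
	}
	return id
}

func main() {
	fmt.Println(parseClusterID("65535\n", true)) // x86-style "no cluster"
	fmt.Println(parseClusterID("3\n", true))     // a real cluster ID
	fmt.Println(parseClusterID("", false))       // file missing
}
```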

@ffromani
Contributor

ffromani commented Apr 3, 2026

Maybe we discussed this in the past but still: my main concern is to have a reliable testing pipeline in place which exercises the code. Are there good enough (or at all?) ARM machines we can leverage from github actions and/or from prow?

I understand the requirements and I agree it is nice to support ARM, but without CI in place the support itself will be brittle and unreliable, which is not great for users.

@fmuyassarov
Member

I understand the requirements and I agree it is nice to support ARM, but without CI in place the support itself will be brittle and unreliable, which is not great for users.

One option is to ask the CNCF Service Desk for ARM machines from Oracle Cloud, since Oracle has donated credits to the CNCF.

Akamai pledged $700K in cloud credits to the CNCF for project use. Projects can now request access to onboard Linode. This platform provides another alternative to Equinix which is due to sunset later this year. Oracle Cloud credits also continue to be available, particularly for AI workloads on ARM and bare metal deployments. Request access to either service via the Service Desk.

@fmuyassarov
Member

And then we could even set up self-hosted GitHub runners or something similar.

@pravk03
Contributor Author

pravk03 commented Apr 3, 2026

I am thinking about how we handle this in kubelet today and whether we can follow the same approach here. I believe kubelet relies on cAdvisor for all topology information. For the long term, would it make sense for us to adopt the same mechanism here?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 3, 2026
@pravk03
Contributor Author

pravk03 commented Apr 8, 2026

Once kubernetes/test-infra#36640 merges, I think we can have arm e2e tests on prow (example)

Contributor

@ffromani ffromani left a comment

Initial review; I will need another pass. Nothing concerning so far, and some nice improvements worth merging anyway.
It seems we have a bit of a chicken-and-egg problem: we can't enable ARM CI without this PR, but we can't test this PR without CI or without developers getting access to ARM boxes, which is impractical.

But I'm positive about this PR, so we can probably bootstrap merging this code first.

Comment thread pkg/cpuinfo/cpuinfo.go Outdated
Comment on lines +291 to +294
// On many x86 systems, the kernel reports 65535 to indicate "No Cluster".
// We treat this as -1 to mean unknown/not applicable.
if clusterID == 65535 {
cpuInfo.ClusterID = -1
Contributor

Is this behavior documented somewhere in the kernel docs? If so, please add a link.
It would also be nice to move the magic numbers into module-private constants.

Contributor Author

Updated the comment and added link to kernel documentation

Comment thread pkg/cpuinfo/cpuinfo.go
return nil
}

// TODO: Handle more complex sibling relationships (e.g. 4-way SMT) if needed in the future. For now we only handle 2-way hyperthreading which is the most common case.
Contributor

I agree: let's note this may happen but let's not implement if we can't test it on real HW

@fmuyassarov
Member

From a quick test run on my Pi, everything looked correct except I noticed that the NUMA id is missing.

  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: dra-driver-cpu-control-plane
    uid: 4806a758-aa47-4075-8088-a63328e3d1fb
  resourceVersion: "1421"
  uid: e3d80168-473d-4b01-bcd3-adf7cd910253
spec:
  devices:
  - attributes:
      dra.cpu/cacheL3ID:
        int: -1
      dra.cpu/coreID:
        int: 0
      dra.cpu/coreType:
        string: standard
      dra.cpu/cpuID:
        int: 0
      dra.cpu/numaNodeID:
        int: 0
      dra.cpu/smtEnabled:
        bool: false
      dra.cpu/socketID:
        int: 0
      dra.net/numaNode:
        int: 0
    name: cpudev000
  - attributes:
      dra.cpu/cacheL3ID:
        int: -1
      dra.cpu/coreID:
        int: 1
      dra.cpu/coreType:
        string: standard
      dra.cpu/cpuID:
        int: 1
      dra.cpu/numaNodeID:
        int: 0
      dra.cpu/smtEnabled:
        bool: false
      dra.cpu/socketID:
        int: 0
      dra.net/numaNode:
        int: 0
    name: cpudev001
  - attributes:
      dra.cpu/cacheL3ID:
        int: -1
      dra.cpu/coreID:
        int: 2
      dra.cpu/coreType:
        string: standard
      dra.cpu/cpuID:
        int: 2
      dra.cpu/numaNodeID:
        int: 0
      dra.cpu/smtEnabled:
        bool: false
      dra.cpu/socketID:
        int: 0
      dra.net/numaNode:
        int: 0
    name: cpudev002
  - attributes:
      dra.cpu/cacheL3ID:
        int: -1
      dra.cpu/coreID:
        int: 3
      dra.cpu/coreType:
        string: standard
      dra.cpu/cpuID:
        int: 3
      dra.cpu/numaNodeID:
        int: 0
      dra.cpu/smtEnabled:
        bool: false
      dra.cpu/socketID:
        int: 0
      dra.net/numaNode:
        int: 0
    name: cpudev003
  driver: dra.cpu
  nodeName: dra-driver-cpu-control-plane

$ lscpu
Architecture:                aarch64
  CPU op-mode(s):            32-bit, 64-bit
  Byte Order:                Little Endian
CPU(s):                      4
  On-line CPU(s) list:       0-3
Vendor ID:                   ARM
  Model name:                Cortex-A72
    Model:                   3
    Thread(s) per core:      1
    Core(s) per cluster:     4
    Socket(s):               -
    Cluster(s):              1
    Stepping:                r0p3
    CPU(s) scaling MHz:      100%
    CPU max MHz:             1800.0000
    CPU min MHz:             600.0000
    BogoMIPS:                108.00
    Flags:                   fp asimd evtstrm crc32 cpuid
Caches (sum of all):
  L1d:                       128 KiB (4 instances)
  L1i:                       192 KiB (4 instances)
  L2:                        1 MiB (1 instance)
NUMA:
  NUMA node(s):              2
  NUMA node0 CPU(s):         0-3
  NUMA node1 CPU(s):         0-3
...

$ numactl --hardware && echo "---" && lscpu | grep -E 'Socket|NUMA|Core|Thread'
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 1853 MB
node 0 free: 140 MB
node 1 cpus: 0 1 2 3
node 1 size: 1943 MB
node 1 free: 94 MB
node distances:
node     0    1
   0:   10   10
   1:   10   10
---
Thread(s) per core:                      1
Core(s) per cluster:                     4
Socket(s):                               -
NUMA node(s):                            2
NUMA node0 CPU(s):                       0-3
NUMA node1 CPU(s):                       0-3
Defaulted container "dracpu" out of: dracpu, enable-nri-and-cdi (init)
I0409 13:49:05.294122    2405 app.go:236] dracpu go go1.25.9 build: fa54c505a210940fdbc0290a77e8bb85cef48826 time: 2026-04-09T12:51:01Z
I0409 13:49:05.294362    2405 app.go:118] FLAG: --add_dir_header="false"
I0409 13:49:05.294399    2405 app.go:118] FLAG: --alsologtostderr="false"
I0409 13:49:05.294408    2405 app.go:118] FLAG: --bind-address=":8080"
I0409 13:49:05.294420    2405 app.go:118] FLAG: --cpu-device-mode="individual"
I0409 13:49:05.294433    2405 app.go:118] FLAG: --group-by="numanode"
I0409 13:49:05.294444    2405 app.go:118] FLAG: --hostname-override=""
I0409 13:49:05.294453    2405 app.go:118] FLAG: --kubeconfig=""
I0409 13:49:05.294460    2405 app.go:118] FLAG: --log_backtrace_at=":0"
I0409 13:49:05.294484    2405 app.go:118] FLAG: --log_dir=""
I0409 13:49:05.294493    2405 app.go:118] FLAG: --log_file=""
I0409 13:49:05.294500    2405 app.go:118] FLAG: --log_file_max_size="1800"
I0409 13:49:05.294516    2405 app.go:118] FLAG: --logtostderr="true"
I0409 13:49:05.294523    2405 app.go:118] FLAG: --one_output="false"
I0409 13:49:05.294530    2405 app.go:118] FLAG: --reserved-cpus=""
I0409 13:49:05.294537    2405 app.go:118] FLAG: --skip_headers="false"
I0409 13:49:05.294544    2405 app.go:118] FLAG: --skip_log_headers="false"
I0409 13:49:05.294552    2405 app.go:118] FLAG: --stderrthreshold="2"
I0409 13:49:05.294562    2405 app.go:118] FLAG: --v="4"
I0409 13:49:05.294579    2405 app.go:118] FLAG: --vmodule=""
I0409 13:49:05.296637    2405 envvar.go:172] "Feature gate default state" feature="InOrderInformers" enabled=true
I0409 13:49:05.296717    2405 envvar.go:172] "Feature gate default state" feature="InOrderInformersBatchProcess" enabled=true
I0409 13:49:05.296772    2405 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=true
I0409 13:49:05.296793    2405 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=true
I0409 13:49:05.296807    2405 envvar.go:172] "Feature gate default state" feature="ClientsAllowCBOR" enabled=false
I0409 13:49:05.296832    2405 envvar.go:172] "Feature gate default state" feature="ClientsPreferCBOR" enabled=false
I0409 13:49:05.302406    2405 draplugin.go:602] "Starting"
I0409 13:49:05.303261    2405 nonblockinggrpcserver.go:90] "GRPC server started" logger="dra" endpoint="/var/lib/kubelet/plugins/dra.cpu/dra.sock"
I0409 13:49:05.303719    2405 nonblockinggrpcserver.go:90] "GRPC server started" logger="registrar" endpoint="/var/lib/kubelet/plugins_registry/dra.cpu-reg.sock"
I0409 13:49:07.304262    2405 cdi.go:76] Initialized CDI file manager for "/var/run/cdi/dra.cpu.json"
time="2026-04-09T13:49:07Z" level=info msg="Created plugin 00-dra.cpu (dracpu, handles CreateContainer,StopContainer,RemoveContainer)"
I0409 13:49:07.304877    2405 app.go:204] driver started
I0409 13:49:07.305027    2405 dra_hooks.go:223] Publishing resources
I0409 13:49:07.305879    2405 resourceslicecontroller.go:538] "Starting ResourceSlice informer and waiting for it to sync" logger="ResourceSlice controller"
I0409 13:49:07.306187    2405 reflector.go:367] "Starting reflector" logger="ResourceSlice controller" type="*v1.ResourceSlice" resyncPeriod="0s" reflector="pkg/mod/k8s.io/client-go@v0.35.0/tools/cache/reflector.go:289"
I0409 13:49:07.306252    2405 reflector.go:411] "Listing and watching" logger="ResourceSlice controller" type="*v1.ResourceSlice" reflector="pkg/mod/k8s.io/client-go@v0.35.0/tools/cache/reflector.go:289"
time="2026-04-09T13:49:07Z" level=info msg="Registering plugin 00-dra.cpu..."
time="2026-04-09T13:49:07Z" level=info msg="Configuring plugin 00-dra.cpu for runtime containerd/v2.2.0..."
time="2026-04-09T13:49:07Z" level=info msg="Started plugin 00-dra.cpu..."
I0409 13:49:07.360822    2405 reflector.go:978] "Exiting watch because received the bookmark that marks the end of initial events stream" logger="ResourceSlice controller" reflector="pkg/mod/k8s.io/client-go@v0.35.0/tools/cache/reflector.go:289" totalItems=1 duration="54.187257ms"
I0409 13:49:07.361135    2405 reflector.go:446] "Caches populated" logger="ResourceSlice controller" type="*v1.ResourceSlice" reflector="pkg/mod/k8s.io/client-go@v0.35.0/tools/cache/reflector.go:289"
I0409 13:49:07.387422    2405 nri_hooks.go:33] Synchronized state with the runtime (10 pods, 10 containers)...
I0409 13:49:07.387791    2405 nri_hooks.go:41] Synchronize pod kube-system/kindnet-gc4wj UID 93c7fefd-9cb7-4688-9457-c13e5671ba19
I0409 13:49:07.387889    2405 nri_hooks.go:41] Synchronize pod kube-system/coredns-7d764666f9-j8ldv UID b3521ec7-baa2-4d34-b630-1fd7ba302d30
I0409 13:49:07.387908    2405 nri_hooks.go:41] Synchronize pod kube-system/coredns-7d764666f9-8grpj UID 5fdb8242-071d-4a7d-84e3-d074fd4be08f
I0409 13:49:07.387923    2405 nri_hooks.go:41] Synchronize pod kube-system/dracpu-2nxh6 UID b8c35ec3-4666-40e6-9fd6-eca33a9038c6
I0409 13:49:07.387936    2405 nri_hooks.go:41] Synchronize pod kube-system/kube-apiserver-dra-driver-cpu-control-plane UID 37ed9f0908e60821052c00fd207c2940
I0409 13:49:07.387950    2405 nri_hooks.go:41] Synchronize pod local-path-storage/local-path-provisioner-67b8995b4b-drswn UID fb74fb34-6064-43a4-ad28-bf74b59ac871
I0409 13:49:07.387961    2405 nri_hooks.go:41] Synchronize pod kube-system/etcd-dra-driver-cpu-control-plane UID 1f74daaa1d7133221e2c745d85d4a2dc
I0409 13:49:07.387972    2405 nri_hooks.go:41] Synchronize pod kube-system/kube-proxy-q8gwb UID a809b7a5-298d-48b7-8bd3-1c1d61dbbf1c
I0409 13:49:07.387984    2405 nri_hooks.go:41] Synchronize pod kube-system/kube-controller-manager-dra-driver-cpu-control-plane UID e6452df7fa0fae1030da6ad72e966c17
I0409 13:49:07.388004    2405 nri_hooks.go:41] Synchronize pod kube-system/kube-scheduler-dra-driver-cpu-control-plane UID 0f01c9d9e97c323e3aa548ef8e0e1a1f
I0409 13:49:08.306950    2405 resourceslicecontroller.go:553] "ResourceSlice informer has synced" logger="ResourceSlice controller"
I0409 13:49:08.307061    2405 resourceslicecontroller.go:184] "Starting" logger="ResourceSlice controller"

@pravk03
Contributor Author

pravk03 commented Apr 9, 2026

Thanks for testing the changes locally @fmuyassarov. Taking a look.

@fmuyassarov
Member

Thanks for testing the changes locally @fmuyassarov. Taking a look.

To add to this, following an offline discussion with @pravk03: I realized from the numactl output that the same CPUs belong to more than one NUMA node.

NUMA node0 CPU(s):   0-3
NUMA node1 CPU(s):   0-3

I would say this isn't a real NUMA setup. Normally each CPU belongs to exactly one NUMA node, but here all 4 cores appear in both nodes, which shouldn't happen. Odd as it is, this is common on the Raspberry Pi 4 (BCM2711 SoC), which exposes two regions in its physical address map: a low memory window and a high memory window. Basically, the Pi was not the best machine to try these changes on :D. Sorry for the confusion @pravk03.
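
A quick way to spot this kind of layout programmatically is to expand each node's cpulist and look for CPUs that show up under more than one node. The following Go sketch does that; `parseCPUList` and `overlappingCPUs` are hypothetical helpers written for this comment, not functions from the driver, and the input map mirrors the `numactl` output above.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseCPUList expands a kernel cpulist string such as "0-3,8" into CPU IDs.
func parseCPUList(s string) ([]int, error) {
	var cpus []int
	s = strings.TrimSpace(s)
	if s == "" {
		return cpus, nil
	}
	for _, part := range strings.Split(s, ",") {
		lo, hi, isRange := strings.Cut(part, "-")
		start, err := strconv.Atoi(lo)
		if err != nil {
			return nil, fmt.Errorf("bad cpulist entry %q: %w", part, err)
		}
		end := start
		if isRange {
			if end, err = strconv.Atoi(hi); err != nil {
				return nil, fmt.Errorf("bad cpulist entry %q: %w", part, err)
			}
		}
		for c := start; c <= end; c++ {
			cpus = append(cpus, c)
		}
	}
	return cpus, nil
}

// overlappingCPUs returns the sorted CPU IDs listed under more than one NUMA node.
// The map key is the NUMA node ID, the value is that node's cpulist string.
func overlappingCPUs(nodeCPULists map[int]string) ([]int, error) {
	nodesPerCPU := map[int]map[int]bool{}
	for node, list := range nodeCPULists {
		cpus, err := parseCPUList(list)
		if err != nil {
			return nil, err
		}
		for _, c := range cpus {
			if nodesPerCPU[c] == nil {
				nodesPerCPU[c] = map[int]bool{}
			}
			nodesPerCPU[c][node] = true
		}
	}
	var shared []int
	for c, nodes := range nodesPerCPU {
		if len(nodes) > 1 {
			shared = append(shared, c)
		}
	}
	sort.Ints(shared)
	return shared, nil
}

func main() {
	// Same layout as the Raspberry Pi 4 output: both nodes list CPUs 0-3.
	shared, err := overlappingCPUs(map[int]string{0: "0-3", 1: "0-3"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("CPUs in more than one NUMA node:", shared)
}
```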

@pravk03
Contributor Author

pravk03 commented Apr 9, 2026

/retest

@pravk03
Contributor Author

pravk03 commented Apr 10, 2026

The flaky presubmit should be fixed by #116

pravk03 added 5 commits April 11, 2026 22:25
The current mechanism of parsing `/proc/cpuinfo` does not work on ARM architectures because SocketID and CoreID are missing.
On ARM, the `id` file is missing from `devices/system/cpu/cpuX/cache/index3`.
We fall back to using the smallest CPU ID in the `shared_cpu_list` at that cache level.
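
The fallback described in the commit message above can be sketched in Go as follows. This is only an illustration: `l3CacheID` and `minCPUInList` are hypothetical names, and the real code in `pkg/cpuinfo` may be structured differently. The idea is that every CPU sharing a cache reports the same `shared_cpu_list`, so the smallest CPU ID in that list works as a stable stand-in for the missing cache `id`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minCPUInList returns the smallest CPU ID in a cpulist such as "4-7,12-15".
func minCPUInList(list string) (int, error) {
	lowest := -1
	for _, part := range strings.Split(strings.TrimSpace(list), ",") {
		// Only the start of each range can be the minimum of that range.
		first, _, _ := strings.Cut(part, "-")
		id, err := strconv.Atoi(first)
		if err != nil {
			return -1, fmt.Errorf("bad cpulist entry %q: %w", part, err)
		}
		if lowest < 0 || id < lowest {
			lowest = id
		}
	}
	return lowest, nil
}

// l3CacheID prefers the kernel-provided cache id and otherwise derives one
// from shared_cpu_list. hasID is false when the id file does not exist.
func l3CacheID(idFile string, hasID bool, sharedCPUList string) (int, error) {
	if hasID {
		return strconv.Atoi(strings.TrimSpace(idFile))
	}
	return minCPUInList(sharedCPUList)
}

func main() {
	// No id file (as on the ARM machines discussed here): fall back to the list.
	id, err := l3CacheID("", false, "4-7,12-15\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("derived L3 cache ID:", id)
}
```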
Contributor

@ffromani ffromani left a comment

Besides a terminology thingy which we can fix later, LGTM.
This is a nice improvement in how we extract the topology data from the system; I'd merge it anyway, even without ARM support.

Comment thread pkg/cpuinfo/cpuinfo.go Outdated
@ffromani
Contributor

/approve
/lgtm

@pravk03 your call whether to address the terminology item here, later, or not at all. Unhold whenever you prefer.
I'm fine merging this PR first and enabling ARM in CI later: it is worth merging for the x86 improvements alone, but to claim ARM support we eventually need CI.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 13, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, pravk03

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 13, 2026
@AutuSnow
Contributor

/lgtm
/cc @ffromani

@k8s-ci-robot k8s-ci-robot requested a review from ffromani April 13, 2026 15:43
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 13, 2026
@pravk03
Contributor Author

pravk03 commented Apr 13, 2026

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 13, 2026
@k8s-ci-robot k8s-ci-robot merged commit 1736c9f into kubernetes-sigs:main Apr 13, 2026
9 checks passed
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

5 participants