feat: add HardwareDetector interface and measurement keys for NFD integration #482

Merged
mchmarny merged 6 commits into NVIDIA:main from ArangoGutierrez:feat/nfd-infrastructure on Apr 3, 2026

@ArangoGutierrez commented Apr 2, 2026

Summary

Task 0 of NFD Snapshot Enrichment (Track A).

Adds shared infrastructure for day-0 GPU hardware detection:

  • Measurement keys: gpu-present, driver-loaded, detection-source (with KeyGPU* prefix convention)
  • Timeout constant: NFDDetectionTimeout (5s)
  • HardwareDetector interface and HardwareInfo type (including DetectionSource field)

Note

The NFD Go dependency (sigs.k8s.io/node-feature-discovery) and PCI/NFD constants
(vendor ID, device classes, source names) are deferred to Task 1 when code imports
NFD packages. go mod tidy removes unused imports, and golangci-lint flags
unused constants, so both are best added alongside their first consumer.

Testing

  • go build ./... passes
  • All existing tests pass with -race
  • Error-path test case validates nil-safety of HardwareDetector mock

github-actions bot commented Apr 2, 2026

Welcome to AICR, @ArangoGutierrez! Thanks for your first pull request.

Before review, please ensure:

  • All commits are signed off per the DCO
  • CI checks pass (tests, lint, security scan)
  • The PR description explains the why behind your changes

A maintainer will review this soon.

@ArangoGutierrez force-pushed the feat/nfd-infrastructure branch from b420e88 to 3770f88 on April 2, 2026 18:05
@github-actions bot added size/XL and removed size/M labels on Apr 2, 2026
@ArangoGutierrez force-pushed the feat/nfd-infrastructure branch 2 times, most recently from 4e9a4a7 to 96d81fc, on April 2, 2026 18:30
@ArangoGutierrez marked this pull request as ready for review on April 2, 2026 18:39
@ArangoGutierrez requested a review from a team as a code owner on April 2, 2026 18:39

…Detector interface

Add NFD API types dependency (sigs.k8s.io/node-feature-discovery/api/nfd@v0.18.3)
for GPU hardware detection without driver dependency.

New measurement keys: KeyGPUPresent, KeyDriverLoaded, KeyDetectionSource
in pkg/measurement/types.go for NFD-based hardware detection results.

New timeout: NFDDetectionTimeout (5s) in pkg/defaults/timeouts.go for
local sysfs/procfs operations (PCI enumeration and kernel module listing).

New interface: HardwareDetector with HardwareInfo type in
pkg/collector/gpu/hardware.go for abstracting GPU hardware detection.
PCI and NFD constants deferred to Task 1 when they are consumed.

New tests: TestHardwareDetectorInterface validates interface contract
with mock implementation. NFDDetectionTimeout added to TestTimeoutConstants.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

mchmarny

This comment was marked as resolved.

mchmarny

This comment was marked as resolved.

The blank import of sigs.k8s.io/node-feature-discovery/api/nfd/v1alpha1
executes K8s scheme registration init() functions in every binary.
The NFD dependency will be added in Task 1 when actual code uses it.

Addresses reviewer feedback: critical finding.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

Adds a DetectionSource field to HardwareInfo, aligning the struct with the
KeyGPUDetectionSource measurement key so implementations don't need to pass
the detection source out-of-band.

Addresses reviewer feedback: important finding.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

Adds 'detection failure' test case exercising wantErr=true with nil info.
Adds early return after error check to prevent nil-pointer dereference.
Adds DetectionSource field assertions to existing test cases.

Addresses reviewer feedback: important finding.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

KeyDriverLoaded -> KeyGPUDriverLoaded
KeyDetectionSource -> KeyGPUDetectionSource

String values unchanged. Aligns with existing KeyGPUDriver, KeyGPUModel,
KeyGPUCount naming convention.

Addresses reviewer feedback: minor finding.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

@github-actions bot added size/M and removed size/XL labels on Apr 3, 2026
@ArangoGutierrez requested a review from mchmarny on April 3, 2026 12:09

@mchmarny left a comment

thanks for the fixes

/lgtm

@mchmarny mchmarny merged commit ba20188 into NVIDIA:main Apr 3, 2026
15 checks passed