@renovate renovate bot commented Jan 29, 2026

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| ghcr.io/home-operations/charts-mirror/dcgm-exporter | minor | `4.7.1` -> `4.8.0` |

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

deepsource-io bot commented Jan 29, 2026

Here's the code health analysis summary for commits 9e0546e..ba375c0. View details on DeepSource ↗.

Analysis Summary

| Analyzer | Status | Summary | Link |
|---|---|---|---|
| JavaScript | ✅ Success | | View Check ↗ |
| Shell | ✅ Success | | View Check ↗ |

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

tanguille-cluster bot commented Jan 29, 2026

--- HelmRelease: observability/dcgm-exporter ConfigMap: observability/exporter-metrics-config-map
+++ HelmRelease: observability/dcgm-exporter ConfigMap: observability/exporter-metrics-config-map
@@ -47,12 +47,13 @@
     # DCGM_FI_DEV_LOW_UTIL_VIOLATION,    counter, Throttling duration due to low utilization (in ns).
     # DCGM_FI_DEV_RELIABILITY_VIOLATION, counter, Throttling duration due to reliability constraints (in ns).
 
     # Memory usage
     DCGM_FI_DEV_FB_FREE, gauge, Framebuffer memory free (in MiB).
     DCGM_FI_DEV_FB_USED, gauge, Framebuffer memory used (in MiB).
+    DCGM_FI_DEV_FB_RESERVED, gauge, Framebuffer memory reserved (in MiB).
 
     # ECC
     # DCGM_FI_DEV_ECC_SBE_VOL_TOTAL, counter, Total number of single-bit volatile ECC errors.
     # DCGM_FI_DEV_ECC_DBE_VOL_TOTAL, counter, Total number of double-bit volatile ECC errors.
     # DCGM_FI_DEV_ECC_SBE_AGG_TOTAL, counter, Total number of single-bit persistent ECC errors.
     # DCGM_FI_DEV_ECC_DBE_AGG_TOTAL, counter, Total number of double-bit persistent ECC errors.
@@ -75,18 +76,21 @@
 
     # Remapped rows
     DCGM_FI_DEV_UNCORRECTABLE_REMAPPED_ROWS, counter, Number of remapped rows for uncorrectable errors
     DCGM_FI_DEV_CORRECTABLE_REMAPPED_ROWS,   counter, Number of remapped rows for correctable errors
     DCGM_FI_DEV_ROW_REMAP_FAILURE,           gauge,   Whether remapping of rows has failed
 
+    # Static configuration information. These appear as labels on the other metrics
+    DCGM_FI_DRIVER_VERSION,        label, Driver Version
+
     # DCP metrics
     DCGM_FI_PROF_GR_ENGINE_ACTIVE,   gauge, Ratio of time the graphics engine is active.
     # DCGM_FI_PROF_SM_ACTIVE,          gauge, The ratio of cycles an SM has at least 1 warp assigned.
     # DCGM_FI_PROF_SM_OCCUPANCY,       gauge, The ratio of number of warps resident on an SM.
     DCGM_FI_PROF_PIPE_TENSOR_ACTIVE, gauge, Ratio of cycles the tensor (HMMA) pipe is active.
     DCGM_FI_PROF_DRAM_ACTIVE,        gauge, Ratio of cycles the device memory interface is active sending or receiving data.
     # DCGM_FI_PROF_PIPE_FP64_ACTIVE,   gauge, Ratio of cycles the fp64 pipes are active.
     # DCGM_FI_PROF_PIPE_FP32_ACTIVE,   gauge, Ratio of cycles the fp32 pipes are active.
     # DCGM_FI_PROF_PIPE_FP16_ACTIVE,   gauge, Ratio of cycles the fp16 pipes are active.
-    DCGM_FI_PROF_PCIE_TX_BYTES,      counter, The number of bytes of active pcie tx data including both header and payload.
-    DCGM_FI_PROF_PCIE_RX_BYTES,      counter, The number of bytes of active pcie rx data including both header and payload.
+    DCGM_FI_PROF_PCIE_TX_BYTES,      gauge, The rate of data transmitted over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
+    DCGM_FI_PROF_PCIE_RX_BYTES,      gauge, The rate of data received over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
 
--- HelmRelease: observability/dcgm-exporter DaemonSet: observability/dcgm-exporter
+++ HelmRelease: observability/dcgm-exporter DaemonSet: observability/dcgm-exporter
@@ -61,13 +61,13 @@
             add:
             - SYS_ADMIN
             drop:
             - ALL
           runAsNonRoot: false
           runAsUser: 0
-        image: nvcr.io/nvidia/k8s/dcgm-exporter:4.4.2-4.7.1-ubuntu22.04
+        image: mirror.gcr.io/nvidia/dcgm-exporter:4.5.1-4.8.0-distroless
         imagePullPolicy: IfNotPresent
         args:
         - -f
         - /etc/dcgm-exporter/default-counters.csv
         env:
         - name: DCGM_EXPORTER_KUBERNETES
--- HelmRelease: observability/dcgm-exporter ServiceMonitor: observability/dcgm-exporter
+++ HelmRelease: observability/dcgm-exporter ServiceMonitor: observability/dcgm-exporter
@@ -19,10 +19,11 @@
     matchNames:
     - observability
   endpoints:
   - port: metrics
     path: /metrics
     interval: 15s
+    scrapeTimeout: 10s
     honorLabels: true
     relabelings: []
     metricRelabelings: []
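
The ConfigMap diff above adds two entries and changes the type of the two PCIe metrics from `counter` to `gauge`. A minimal Python sketch for spotting such changes between two versions of the counters file follows. It assumes every active line has the shape `field, type, help text` seen in the diff. `parse_counters`, `changed_types`, and the `OLD`/`NEW` excerpts are hypothetical names for illustration, not part of dcgm-exporter.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    field: str
    kind: str  # "gauge", "counter", or "label" in the diff above
    help: str


def parse_counters(text: str) -> dict[str, Metric]:
    """Parse active (non-comment) lines of a counters file into Metric records."""
    metrics: dict[str, Metric] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out metrics
        # Split on the first two commas only, so help text may contain commas.
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) != 3:
            continue  # ignore lines that do not match the assumed layout
        metrics[parts[0]] = Metric(*parts)
    return metrics


def changed_types(old: dict[str, Metric], new: dict[str, Metric]) -> dict[str, tuple[str, str]]:
    """Return {field: (old_type, new_type)} for metrics present in both versions."""
    return {
        name: (old[name].kind, new[name].kind)
        for name in old.keys() & new.keys()
        if old[name].kind != new[name].kind
    }


# Excerpts copied from the removed (-) and added (+) sides of the diff above.
OLD = """
# Memory usage
DCGM_FI_DEV_FB_FREE, gauge, Framebuffer memory free (in MiB).
DCGM_FI_DEV_FB_USED, gauge, Framebuffer memory used (in MiB).
DCGM_FI_PROF_PCIE_TX_BYTES,      counter, The number of bytes of active pcie tx data including both header and payload.
DCGM_FI_PROF_PCIE_RX_BYTES,      counter, The number of bytes of active pcie rx data including both header and payload.
"""

NEW = """
# Memory usage
DCGM_FI_DEV_FB_FREE, gauge, Framebuffer memory free (in MiB).
DCGM_FI_DEV_FB_USED, gauge, Framebuffer memory used (in MiB).
DCGM_FI_DEV_FB_RESERVED, gauge, Framebuffer memory reserved (in MiB).
DCGM_FI_DRIVER_VERSION,        label, Driver Version
DCGM_FI_PROF_PCIE_TX_BYTES,      gauge, The rate of data transmitted over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
DCGM_FI_PROF_PCIE_RX_BYTES,      gauge, The rate of data received over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
"""

old_metrics = parse_counters(OLD)
new_metrics = parse_counters(NEW)

print("type changes:", sorted(changed_types(old_metrics, new_metrics).items()))
print("added:", sorted(new_metrics.keys() - old_metrics.keys()))
```

Because the new help text describes the PCIe metrics as rates in bytes per second, dashboards or alerts that wrap them in `rate()` may need a second look after this update.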
 

tanguille-cluster bot commented Jan 29, 2026

--- kubernetes/apps/observability/exporters/dcgm-exporter/app Kustomization: observability/dcgm-exporter HelmRelease: observability/dcgm-exporter
+++ kubernetes/apps/observability/exporters/dcgm-exporter/app Kustomization: observability/dcgm-exporter HelmRelease: observability/dcgm-exporter
@@ -38,12 +38,15 @@
           - matchExpressions:
             - key: extensions.talos.dev/nvidia-container-toolkit-production
               operator: Exists
     extraEnv:
       NVIDIA_DRIVER_CAPABILITIES: all
       NVIDIA_VISIBLE_DEVICES: all
+    image:
+      repository: mirror.gcr.io/nvidia/dcgm-exporter
+      tag: 4.5.1-4.8.0-distroless
     resources:
       limits:
         memory: 1024Mi
         nvidia.com/gpu: 1
       requests:
         memory: 128Mi
--- kubernetes/apps/observability/exporters/dcgm-exporter/app Kustomization: observability/dcgm-exporter OCIRepository: observability/dcgm-exporter
+++ kubernetes/apps/observability/exporters/dcgm-exporter/app Kustomization: observability/dcgm-exporter OCIRepository: observability/dcgm-exporter
@@ -10,9 +10,9 @@
 spec:
   interval: 5m
   layerSelector:
     mediaType: application/vnd.cncf.helm.chart.content.v1.tar+gzip
     operation: copy
   ref:
-    tag: 4.7.1
+    tag: 4.8.0
   url: oci://ghcr.io/home-operations/charts-mirror/dcgm-exporter
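
The HelmRelease now pins the image tag separately from the chart tag, so the two can drift apart on a later chart bump. A small Python sketch of a consistency check follows. It assumes the image tag is laid out as `<DCGM version>-<exporter version>-<variant>`, which is inferred only from the two tags in this PR (`4.4.2-4.7.1-ubuntu22.04` and `4.5.1-4.8.0-distroless`). `parse_tag` and `matches_chart` are hypothetical helper names.

```python
from typing import NamedTuple


class ExporterTag(NamedTuple):
    dcgm: str      # first component, assumed to be the DCGM version
    exporter: str  # second component, assumed to track the chart version
    variant: str   # remainder, e.g. "ubuntu22.04" or "distroless"


def parse_tag(tag: str) -> ExporterTag:
    """Split an image tag on its first two hyphens (assumed layout)."""
    parts = tag.split("-", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected tag layout: {tag!r}")
    return ExporterTag(*parts)


def matches_chart(image_tag: str, chart_tag: str) -> bool:
    """True when the exporter component of the image tag equals the chart tag."""
    return parse_tag(image_tag).exporter == chart_tag


# Values taken from the diffs in this PR.
print(matches_chart("4.5.1-4.8.0-distroless", "4.8.0"))
print(matches_chart("4.4.2-4.7.1-ubuntu22.04", "4.8.0"))
```

The first call compares the newly pinned image against the new chart tag. The second shows what a stale pin would look like if only the chart tag had been bumped.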
 

@renovate renovate bot force-pushed the renovate/ghcr.io-home-operations-charts-mirror-dcgm-exporter-4.x branch 2 times, most recently from 4e174d1 to 93809ed Compare January 29, 2026 20:26
@renovate renovate bot force-pushed the renovate/ghcr.io-home-operations-charts-mirror-dcgm-exporter-4.x branch from 93809ed to 347c86d Compare January 29, 2026 20:47
@Tanguille Tanguille merged commit 9ea6675 into main Jan 29, 2026
16 checks passed
@Tanguille Tanguille deleted the renovate/ghcr.io-home-operations-charts-mirror-dcgm-exporter-4.x branch January 29, 2026 20:56