41 changes: 0 additions & 41 deletions .eslintrc

This file was deleted.

27 changes: 27 additions & 0 deletions .eslintrc.yml
@@ -0,0 +1,27 @@
+env:
+  browser: true
+  es2021: true
+extends:
+  - eslint:recommended
+  - plugin:react/recommended
+  - plugin:@typescript-eslint/recommended
+  - prettier
+parser: '@typescript-eslint/parser'
+parserOptions:
+  ecmaFeatures:
+    jsx: true
+  ecmaVersion: 2016
+  sourceType: module
+plugins:
+  - prettier
+  - react
+  - '@typescript-eslint'
+rules:
+  prettier/prettier:
+    - error
+  react/prop-types: off
+settings:
+  react:
+    version: detect


3 changes: 3 additions & 0 deletions .gitignore
@@ -74,6 +74,7 @@ typings/

 # parcel-bundler cache (https://parceljs.org/)
 .cache
+.cache_*

 # Next.js build output
 .next
@@ -102,3 +103,5 @@ dist

 # TernJS port file
 .tern-port
+
+.vscode
6 changes: 2 additions & 4 deletions .prettierrc.yaml → .prettierrc.yml
@@ -2,7 +2,5 @@ arrowParens: always
printWidth: 100
singleQuote: true
trailingComma: all
tabWidth: 2
semi: true
jsxSingleQuote: false
proseWrap: always


28 changes: 28 additions & 0 deletions .stylelintrc.yaml
@@ -0,0 +1,28 @@
+extends:
+  - stylelint-config-standard
+rules:
+  # Disallow color names and hex colors as these don't work well with dark mode.
+  # Use PF global variables instead:
+  # https://patternfly-react-main.surge.sh/developer-resources/global-css-variables#global-css-variables
+  color-named: never
+  color-no-hex: true
+  # PatternFly CSS vars don't conform to stylelint's regex. Disable this rule.
+  custom-property-pattern: null
+  function-disallowed-list:
+    - rgb
+  # Disable the standard rule to allow BEM-style classnames with underscores.
+  selector-class-pattern: null
+  # Disallow CSS classnames prefixed with .pf- or .co- as these prefixes are
+  # reserved by PatternFly and OpenShift console.
+  selector-disallowed-list:
+    - "*"
+    - /\.(pf|co)-/
+  # Plugins should avoid naked element selectors like `table` and `li` since
+  # this can impact layout of existing pages in console.
+  selector-max-type:
+    - 0
+    - ignore:
+        - compounded
+        - descendant
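
Note on the `selector-disallowed-list` pattern above: a minimal sketch of which selectors the `/\.(pf|co)-/` entry would reject. Python's `re` module stands in for stylelint's JavaScript regex engine here, which is an assumption that holds for a pattern this simple. The helper name and the sample selectors are hypothetical and not taken from the plugin's stylesheets.

```python
import re

# Pattern copied from the selector-disallowed-list rule in .stylelintrc.yaml.
# Python's `re` is only a stand-in for stylelint's JavaScript regex engine.
RESERVED_PREFIX = re.compile(r"\.(pf|co)-")


def uses_reserved_prefix(selector: str) -> bool:
    """Return True if the selector contains a class starting with .pf- or .co-."""
    return RESERVED_PREFIX.search(selector) is not None


# Hypothetical selectors, chosen only for illustration.
samples = [
    ".pf-c-button",            # PatternFly-style class: rejected
    ".co-resource-item",       # console-style class: rejected
    ".my-plugin__gpu-card",    # BEM-style plugin class: allowed
    ".my-plugin .pf-c-table",  # reserved class inside a longer selector: rejected
]

for selector in samples:
    verdict = "disallowed" if uses_reserved_prefix(selector) else "ok"
    print(f"{selector}: {verdict}")
```

Because the pattern is unanchored, a reserved class is caught anywhere in the selector, not only at its start.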


2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -50,7 +50,7 @@
     "**/dist": true
   },
   "editor.codeActionsOnSave": {
-    "source.fixAll.eslint": true
+    "source.fixAll.eslint": "explicit"
   },
   "editor.quickSuggestions": {
     "comments": "on",
15 changes: 8 additions & 7 deletions Dockerfile
@@ -1,13 +1,14 @@
-
-FROM registry.access.redhat.com/ubi8/nodejs-16:latest AS builder
+FROM registry.access.redhat.com/ubi9/nodejs-20:latest AS build
 USER root
 RUN command -v yarn || npm i -g yarn

-COPY . /opt/app-root/src
-RUN yarn install --frozen-lockfile && yarn build
+ADD . /usr/src/app
+WORKDIR /usr/src/app
+RUN yarn install && yarn build

 FROM registry.access.redhat.com/ubi9/nginx-120:latest
-COPY default.conf "${NGINX_CONFIGURATION_PATH}"
-COPY --from=builder /opt/app-root/src/dist .
+
+COPY --from=build /usr/src/app/dist /usr/share/nginx/html
 USER 1001
-CMD /usr/libexec/s2i/run
+
+ENTRYPOINT ["nginx", "-g", "daemon off;"]
9 changes: 5 additions & 4 deletions README.md
@@ -11,15 +11,16 @@ Dynamic plugin for the OpenShift console which adds GPU capabilities.

 | NVIDIA GPU plugin | OCP Console |
 | ---------------------- | ----------- |
-| release-0.2.5/latest | 4.12-4.18 |
-| release-0.2.4 | 4.11 |
-| release-0.0.1 | 4.10 |
+| main/latest | 4.19+ |
+| 0.2.5 | 4.12-4.18 |
+| 0.2.4 | 4.11 |
+| 0.0.1 | 4.10 |

 ## QuickStart

 ### Prerequisites

-- [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) 4.12-4.18
+- [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) 4.19+ (for main/latest)
 - [NVIDIA GPU operator](https://github.com/NVIDIA/gpu-operator)
 - [Helm](https://helm.sh/docs/intro/install/)

14 changes: 0 additions & 14 deletions default.conf

This file was deleted.

2 changes: 1 addition & 1 deletion deployment/console-plugin-nvidia-gpu/Chart.yaml
@@ -1,7 +1,7 @@
 apiVersion: v2
 appVersion: latest
 description: |
-  Red Hat OpenShift dynamic console plugin that leverages the NVIDIA GPU operator metrics and serves the respective console-extensions. Requires Red Hat OpenShift version 4.12+
+  Red Hat OpenShift dynamic console plugin that leverages the NVIDIA GPU operator metrics and serves the respective console-extensions. Requires Red Hat OpenShift version 4.19+
 name: console-plugin-nvidia-gpu
 sources:
   - https://github.com/rh-ecosystem-edge/console-plugin-nvidia-gpu
2 changes: 1 addition & 1 deletion deployment/console-plugin-nvidia-gpu/README.md
@@ -9,7 +9,7 @@ in order to serve the respective [console-extensions](https://github.com/openshi

 ### Prerequisites

-- [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) 4.12-4.18
+- [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) 4.19+
 - [NVIDIA GPU operator](https://github.com/NVIDIA/gpu-operator)
 - [Helm](https://helm.sh/docs/intro/install/)

77 changes: 66 additions & 11 deletions deployment/console-plugin-nvidia-gpu/templates/configmap.yaml
@@ -6,14 +6,69 @@ metadata:
     {{- include "console-plugin-nvidia-gpu.labels" . | nindent 4 }}
 data:
   dcgm-metrics.csv: |
-    DCGM_FI_PROF_GR_ENGINE_ACTIVE, gauge, gpu utilization.
-    DCGM_FI_DEV_MEM_COPY_UTIL, gauge, mem utilization.
-    DCGM_FI_DEV_ENC_UTIL, gauge, enc utilization.
-    DCGM_FI_DEV_DEC_UTIL, gauge, dec utilization.
-    DCGM_FI_DEV_POWER_USAGE, gauge, power usage.
-    DCGM_FI_DEV_POWER_MGMT_LIMIT_MAX, gauge, power mgmt limit.
-    DCGM_FI_DEV_GPU_TEMP, gauge, gpu temp.
-    DCGM_FI_DEV_SM_CLOCK, gauge, sm clock.
-    DCGM_FI_DEV_MAX_SM_CLOCK, gauge, max sm clock.
-    DCGM_FI_DEV_MEM_CLOCK, gauge, mem clock.
-    DCGM_FI_DEV_MAX_MEM_CLOCK, gauge, max mem clock.
+    # This configuration replaces the default DCGM metrics included in the dcgm-exporter image.
+    # The default metrics are defined in /etc/dcgm-exporter/dcp-metrics-included.csv
+    #
+    # Note: When a custom metrics ConfigMap is provided, it completely overrides the defaults.
+    # This file includes all metrics from the default set plus additional metrics required
+    # by the console plugin.
+    #
+    # Default metrics may change between dcgm-exporter versions. Last synced with:
+    # dcgm-exporter default dcp-metrics-included.csv
+
+    # Clocks
+    DCGM_FI_DEV_SM_CLOCK, gauge, SM clock frequency (in MHz).
+    DCGM_FI_DEV_MEM_CLOCK, gauge, Memory clock frequency (in MHz).
+
+    # Temperature
+    DCGM_FI_DEV_MEMORY_TEMP, gauge, Memory temperature (in C).
+    DCGM_FI_DEV_GPU_TEMP, gauge, GPU temperature (in C).
+
+    # Power
+    DCGM_FI_DEV_POWER_USAGE, gauge, Power draw (in W).
+    DCGM_FI_DEV_TOTAL_ENERGY_CONSUMPTION, counter, Total energy consumption since boot (in mJ).
+
+    # PCIE
+    DCGM_FI_DEV_PCIE_REPLAY_COUNTER, counter, Total number of PCIe retries.
+
+    # Utilization
+    DCGM_FI_DEV_GPU_UTIL, gauge, GPU utilization (in %).
+    DCGM_FI_DEV_MEM_COPY_UTIL, gauge, Memory utilization (in %).
+    DCGM_FI_DEV_ENC_UTIL, gauge, Encoder utilization (in %).
+    DCGM_FI_DEV_DEC_UTIL, gauge, Decoder utilization (in %).
+
+    # Errors and violations
+    DCGM_FI_DEV_XID_ERRORS, gauge, Value of the last XID error encountered.
+
+    # Memory usage
+    DCGM_FI_DEV_FB_FREE, gauge, Framebuffer memory free (in MiB).
+    DCGM_FI_DEV_FB_USED, gauge, Framebuffer memory used (in MiB).
+    DCGM_FI_DEV_FB_RESERVED, gauge, Framebuffer memory reserved (in MiB).
+
+    # NVLink
+    DCGM_FI_DEV_NVLINK_BANDWIDTH_TOTAL, counter, Total number of NVLink bandwidth counters for all lanes.
+
+    # VGPU License status
+    DCGM_FI_DEV_VGPU_LICENSE_STATUS, gauge, vGPU License status
+
+    # Remapped rows
+    DCGM_FI_DEV_UNCORRECTABLE_REMAPPED_ROWS, counter, Number of remapped rows for uncorrectable errors
+    DCGM_FI_DEV_CORRECTABLE_REMAPPED_ROWS, counter, Number of remapped rows for correctable errors
+    DCGM_FI_DEV_ROW_REMAP_FAILURE, gauge, Whether remapping of rows has failed
+
+    # Static configuration (labels)
+    DCGM_FI_DRIVER_VERSION, label, Driver Version
+
+    # DCP metrics
+    DCGM_FI_PROF_GR_ENGINE_ACTIVE, gauge, Ratio of time the graphics engine is active.
+    DCGM_FI_PROF_PIPE_TENSOR_ACTIVE, gauge, Ratio of cycles the tensor (HMMA) pipe is active.
+    DCGM_FI_PROF_DRAM_ACTIVE, gauge, Ratio of cycles the device memory interface is active sending or receiving data.
+    DCGM_FI_PROF_PCIE_TX_BYTES, gauge, The rate of data transmitted over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
+    DCGM_FI_PROF_PCIE_RX_BYTES, gauge, The rate of data received over the PCIe bus - including both protocol headers and data payloads - in bytes per second.
+
+    # ========================================
+    # Additional metrics required by console plugin (not in default dcp-metrics-included.csv):
+    # ========================================
+    DCGM_FI_DEV_POWER_MGMT_LIMIT_MAX, gauge, Maximum power management limit.
+    DCGM_FI_DEV_MAX_SM_CLOCK, gauge, Maximum SM clock.
+    DCGM_FI_DEV_MAX_MEM_CLOCK, gauge, Maximum memory clock.
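
Note on the override behaviour described in the ConfigMap comments: since a custom metrics file replaces the defaults entirely, any metric the plugin depends on has to be listed explicitly. The sketch below shows one way such a file could be checked. It assumes the `name, type, help` layout with `#` comment lines seen in this hunk. The function name `parse_metrics` and the shortened excerpt are hypothetical, and the required set covers only the three metrics this file labels as additional.

```python
# Metrics this ConfigMap marks as "required by console plugin"
# (taken from the last section of dcgm-metrics.csv).
REQUIRED_BY_PLUGIN = {
    "DCGM_FI_DEV_POWER_MGMT_LIMIT_MAX",
    "DCGM_FI_DEV_MAX_SM_CLOCK",
    "DCGM_FI_DEV_MAX_MEM_CLOCK",
}


def parse_metrics(csv_text: str) -> dict[str, str]:
    """Map metric name -> metric type, skipping blank and '#' comment lines.

    Assumes every data line has the form `name, type, help text`.
    """
    metrics = {}
    for raw in csv_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first two commas only: the help text may contain commas.
        name, metric_type, _help = (part.strip() for part in line.split(",", 2))
        metrics[name] = metric_type
    return metrics


# Shortened, illustrative excerpt of the file, not the full ConfigMap.
EXCERPT = """\
# Clocks
DCGM_FI_DEV_SM_CLOCK, gauge, SM clock frequency (in MHz).

# Additional metrics required by console plugin
DCGM_FI_DEV_POWER_MGMT_LIMIT_MAX, gauge, Maximum power management limit.
DCGM_FI_DEV_MAX_SM_CLOCK, gauge, Maximum SM clock.
DCGM_FI_DEV_MAX_MEM_CLOCK, gauge, Maximum memory clock.
"""

metrics = parse_metrics(EXCERPT)
missing = REQUIRED_BY_PLUGIN - set(metrics)
print(sorted(missing))  # an empty list means every required metric is exported
```

A check like this could run in CI whenever the file is re-synced with a newer dcgm-exporter default list.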
@@ -7,6 +7,8 @@ metadata:
     {{- include "console-plugin-nvidia-gpu.labels" . | nindent 4 }}
 spec:
   displayName: 'Console Plugin NVIDIA GPU Template'
+  i18n:
+    loadType: Preload
   backend:
     type: Service
     service:
18 changes: 13 additions & 5 deletions deployment/console-plugin-nvidia-gpu/templates/deployment.yaml
@@ -23,31 +23,39 @@ spec:
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.podSecurityContext }}
securityContext:
runAsNonRoot: true
{{- toYaml . | nindent 8 }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
{{- with .Values.containerSecurityContext }}
securityContext:
allowPrivilegeEscalation: false
{{- toYaml . | nindent 10 }}
{{- end }}
ports:
- containerPort: 9443
protocol: TCP
volumeMounts:
- name: plugin-serving-cert
readOnly: true
mountPath: /var/serving-cert
mountPath: /var/cert
- name: nginx-conf
readOnly: true
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
resources:
{{- toYaml .Values.resources | nindent 12 }}
{{- toYaml .Values.resources | nindent 10 }}
volumes:
- name: plugin-serving-cert
secret:
secretName: plugin-serving-cert
defaultMode: 420
- name: nginx-conf
configMap:
name: nginx-conf
name: {{ include "console-plugin-nvidia-gpu.fullname" . }}-config
defaultMode: 420
restartPolicy: Always
dnsPolicy: ClusterFirst
@@ -0,0 +1,24 @@
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "console-plugin-nvidia-gpu.fullname" . }}-config
+  labels:
+    {{- include "console-plugin-nvidia-gpu.labels" . | nindent 4 }}
+data:
+  nginx.conf: |
+    error_log /dev/stdout info;
+    events {}
+    http {
+      access_log /dev/stdout;
+      include /etc/nginx/mime.types;
+      default_type application/octet-stream;
+      keepalive_timeout 65;
+      server {
+        listen {{ .Values.plugin.port }} ssl;
+        listen [::]:{{ .Values.plugin.port }} ssl;
+        ssl_certificate /var/cert/tls.crt;
+        ssl_certificate_key /var/cert/tls.key;
+        root /usr/share/nginx/html;
+      }
+    }
