content/en/docs/plugins/capacity.md
+++
title = "Capacity Plugin"

date = 2025-01-21
lastmod = 2025-01-21

draft = false # Is this a draft? true/false
toc = true # Show table of contents? true/false
type = "docs" # Do not modify.

# Add menu entry to sidebar.
linktitle = "Capacity"
[menu.plugins]
weight = 2
+++

### Capacity

#### Overview

The Capacity plugin manages queue resource allocation using a capacity-based model. It enforces queue capacity limits, guarantees minimum resource allocations, and supports hierarchical queue structures. The plugin calculates each queue's deserved resources based on its capacity, guarantee, and the cluster's total available resources.

#### Features

- **Queue Capacity Management**: Enforces queue capacity limits based on the configured `capability`
- **Resource Guarantees**: Supports minimum resource guarantees for queues
- **Hierarchical Queues**: Supports hierarchical queue structures with parent-child relationships
- **Dynamic Resource Allocation**: Calculates deserved resources dynamically based on queue configuration
- **Resource Reclamation**: Supports resource reclamation from queues exceeding their capacity
- **Job Enqueue Control**: Validates resource availability before allowing jobs to be enqueued

#### Configuration

The Capacity plugin is configured through Queue resources. Here's an example:

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: queue-capacity-example
spec:
  weight: 1
  capability:
    cpu: "100"
    memory: "100Gi"
  guarantee:
    resource:
      cpu: "20"
      memory: "20Gi"
  deserved:
    cpu: "50"
    memory: "50Gi"
```

##### Queue Configuration Fields

- **capability**: Maximum resources the queue can consume
- **guarantee**: Minimum resources guaranteed to the queue
- **deserved**: Desired resource allocation for the queue (calculated automatically if not specified)
- **parent**: Parent queue name for hierarchical queue structures
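
For these fields to take effect, the `capacity` plugin must be enabled in the scheduler configuration; it is mutually exclusive with the default `proportion` plugin, so only one of the two should appear. A minimal sketch of the scheduler configuration (plugin ordering and tier placement here are assumptions to verify against your Volcano release):

```yaml
# volcano-scheduler.conf (sketch)
actions: "enqueue, allocate, backfill, reclaim"
tiers:
- plugins:
  - name: priority
  - name: gang
- plugins:
  - name: capacity    # use instead of the proportion plugin
  - name: predicates
  - name: nodeorder
```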

##### Hierarchical Queue Configuration

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: root-queue
spec:
  weight: 1
  capability:
    cpu: "1000"
    memory: "1000Gi"
---
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: child-queue
spec:
  parent: root-queue
  weight: 1
  capability:
    cpu: "500"
    memory: "500Gi"
  guarantee:
    resource:
      cpu: "100"
      memory: "100Gi"
```

#### How It Works

1. **Capacity Calculation**: The plugin calculates each queue's real capacity by considering the total cluster resources, total guarantees, and the queue's own guarantee and capability.
2. **Deserved Resources**: Deserved resources are calculated based on the queue's real capacity and configured deserved values.
3. **Job Enqueue**: Before a job is enqueued, the plugin validates that the queue has sufficient capacity to accommodate the job's minimum resource requirements.
4. **Resource Allocation**: During scheduling, the plugin ensures that queues don't exceed their allocated capacity.
5. **Reclamation**: Queues that exceed their deserved resources can have tasks reclaimed to make room for other queues.
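
The enqueue check in step 3 can be exercised with a Volcano Job that names a queue and declares its minimum demand via `minAvailable`; the job name, image, and resource figures below are illustrative:

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: capacity-demo             # illustrative name
spec:
  schedulerName: volcano
  queue: queue-capacity-example   # queue from the configuration example above
  minAvailable: 2                 # enqueued only if the queue has room for this minimum
  tasks:
  - replicas: 2
    name: worker
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: worker
          image: busybox
          command: ["sleep", "60"]
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```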

#### Scenario

The Capacity plugin is suitable for:

- **Resource Quota Management**: Enforcing resource limits per queue or department
- **Multi-tenant Clusters**: Isolating resources between different tenants or teams
- **Resource Reservations**: Guaranteeing minimum resources for critical workloads
- **Hierarchical Organizations**: Organizations with nested resource allocation structures

#### Examples

##### Example 1: Basic Capacity Management

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: team-a
spec:
  weight: 1
  capability:
    cpu: "200"
    memory: "200Gi"
    nvidia.com/gpu: "8"
  guarantee:
    resource:
      cpu: "50"
      memory: "50Gi"
      nvidia.com/gpu: "2"
```

##### Example 2: Hierarchical Capacity

```yaml
# Root queue
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: root
spec:
  weight: 1
  capability:
    cpu: "1000"
    memory: "1000Gi"
---
# Development queue
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: dev
spec:
  parent: root
  weight: 1
  capability:
    cpu: "300"
    memory: "300Gi"
---
# Production queue
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: prod
spec:
  parent: root
  weight: 1
  capability:
    cpu: "500"
    memory: "500Gi"
  guarantee:
    resource:
      cpu: "200"
      memory: "200Gi"
```
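
In a hierarchy like this, workloads are submitted to the leaf queues (`dev`, `prod`); the root queue only aggregates capacity. A minimal job targeting the `dev` leaf queue (name, image, and sizes are illustrative):

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: dev-training    # illustrative name
spec:
  schedulerName: volcano
  queue: dev            # must be a leaf queue when hierarchy is enabled
  minAvailable: 1
  tasks:
  - replicas: 1
    name: trainer
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: trainer
          image: busybox
          command: ["sleep", "30"]
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
```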

#### Notes

- When hierarchical queues are enabled, only leaf queues can allocate tasks
- Queues without a `capability` configuration are treated as best-effort queues
- The plugin automatically calculates real capacity considering parent queue constraints
- Resource guarantees cannot exceed queue capabilities
content/en/docs/plugins/deviceshare.md
+++
title = "Device Share Plugin"

date = 2025-01-21
lastmod = 2025-01-21

draft = false # Is this a draft? true/false
toc = true # Show table of contents? true/false
type = "docs" # Do not modify.

# Add menu entry to sidebar.
linktitle = "Device Share"
[menu.plugins]
weight = 3
+++

### Device Share

#### Overview

The Device Share plugin manages the sharing and allocation of device resources such as GPUs, NPUs, and other accelerators. It supports multiple device types including NVIDIA GPUs (both GPU sharing and vGPU), Ascend NPUs, and provides flexible scheduling policies for device allocation. The plugin enables efficient utilization of expensive accelerator resources through sharing capabilities.

#### Features

- **GPU Sharing**: Enable sharing of GPU resources among multiple pods
- **GPU Number**: Schedule based on the number of GPUs requested
- **vGPU Support**: Support for virtual GPU (vGPU) allocation
- **Ascend NPU Support**: Support for Ascend NPU devices including MindCluster VNPU and HAMi VNPU
- **Node Locking**: Optional node-level locking to prevent concurrent device allocations
- **Flexible Scheduling Policies**: Configurable scoring policies for device allocation
- **Batch Node Scoring**: Support for batch scoring of nodes for NPU devices

#### Configuration

The Device Share plugin can be configured with the following arguments:

```yaml
actions: "allocate, backfill"
tiers:
- plugins:
  - name: deviceshare
    arguments:
      deviceshare.GPUSharingEnable: true
      deviceshare.GPUNumberEnable: false
      deviceshare.VGPUEnable: false
      deviceshare.NodeLockEnable: false
      deviceshare.SchedulePolicy: "binpack"
      deviceshare.ScheduleWeight: 10
      deviceshare.AscendMindClusterVNPUEnable: false
      deviceshare.AscendHAMiVNPUEnable: false
      deviceshare.KnownGeometriesCMName: "volcano-vgpu-device-config"
      deviceshare.KnownGeometriesCMNamespace: "kube-system"
```

##### Configuration Parameters

- **deviceshare.GPUSharingEnable** (bool): Enable GPU sharing mode
- **deviceshare.GPUNumberEnable** (bool): Enable GPU number-based scheduling (mutually exclusive with GPUSharingEnable)
- **deviceshare.VGPUEnable** (bool): Enable vGPU support (mutually exclusive with GPU sharing)
- **deviceshare.NodeLockEnable** (bool): Enable node-level locking for device allocation
- **deviceshare.SchedulePolicy** (string): Scheduling policy for device scoring (e.g., "binpack", "spread")
- **deviceshare.ScheduleWeight** (int): Weight for device scoring in node ordering
- **deviceshare.AscendMindClusterVNPUEnable** (bool): Enable Ascend MindCluster VNPU support
- **deviceshare.AscendHAMiVNPUEnable** (bool): Enable Ascend HAMi VNPU support
- **deviceshare.KnownGeometriesCMName** (string): ConfigMap name for vGPU geometries
- **deviceshare.KnownGeometriesCMNamespace** (string): Namespace for vGPU geometries ConfigMap

#### Device Types

##### NVIDIA GPU Sharing

Enable GPU sharing to allow multiple pods to share a single GPU:

```yaml
- name: deviceshare
  arguments:
    deviceshare.GPUSharingEnable: true
    deviceshare.ScheduleWeight: 10
```

Pods request GPU resources using:

```yaml
resources:
  requests:
    nvidia.com/gpu: 2 # Request 2 GPU units (out of 100 per GPU)
  limits:
    nvidia.com/gpu: 2
```

##### NVIDIA GPU Number

Schedule based on the number of physical GPUs:

```yaml
- name: deviceshare
  arguments:
    deviceshare.GPUNumberEnable: true
    deviceshare.ScheduleWeight: 10
```

Pods request whole GPUs:

```yaml
resources:
  requests:
    nvidia.com/gpu: 1 # Request 1 whole GPU
  limits:
    nvidia.com/gpu: 1
```

##### vGPU

Enable virtual GPU support:

```yaml
- name: deviceshare
  arguments:
    deviceshare.VGPUEnable: true
    deviceshare.ScheduleWeight: 10
    deviceshare.KnownGeometriesCMName: "volcano-vgpu-device-config"
    deviceshare.KnownGeometriesCMNamespace: "kube-system"
```
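
Pods then request vGPU slices rather than whole devices. The resource names below follow the Volcano vGPU convention, but treat them as assumptions to confirm against the device-plugin version you deploy:

```yaml
resources:
  limits:
    volcano.sh/vgpu-number: 1    # one virtual GPU slice
    volcano.sh/vgpu-memory: 3000 # vGPU memory (assumed to be in MB)
```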

##### Ascend NPU

Enable Ascend NPU support:

```yaml
- name: deviceshare
  arguments:
    deviceshare.AscendMindClusterVNPUEnable: true
    # or, alternatively:
    # deviceshare.AscendHAMiVNPUEnable: true
    deviceshare.ScheduleWeight: 10
```

#### Scenario

The Device Share plugin is suitable for:

- **GPU Clusters**: Clusters with NVIDIA GPU resources requiring efficient sharing
- **AI Training**: Machine learning training workloads requiring GPU acceleration
- **Multi-tenant GPU Sharing**: Environments where multiple users need access to GPU resources
- **NPU Workloads**: Workloads running on Ascend NPU devices
- **Cost Optimization**: Maximizing utilization of expensive accelerator hardware

#### Examples

##### Example 1: GPU Sharing for Small Workloads

Configure GPU sharing for workloads that don't require full GPU resources:

```yaml
- name: deviceshare
  arguments:
    deviceshare.GPUSharingEnable: true
    deviceshare.SchedulePolicy: "binpack"
    deviceshare.ScheduleWeight: 10
```

##### Example 2: Whole GPU Allocation

Configure for workloads requiring full GPU resources:

```yaml
- name: deviceshare
  arguments:
    deviceshare.GPUNumberEnable: true
    deviceshare.SchedulePolicy: "spread"
    deviceshare.ScheduleWeight: 10
```

##### Example 3: vGPU with Custom ConfigMap

Configure vGPU with custom geometry configuration:

```yaml
- name: deviceshare
  arguments:
    deviceshare.VGPUEnable: true
    deviceshare.ScheduleWeight: 10
    deviceshare.KnownGeometriesCMName: "custom-vgpu-config"
    deviceshare.KnownGeometriesCMNamespace: "gpu-system"
```

#### Notes

- GPU sharing and GPU number modes are mutually exclusive
- GPU sharing and vGPU cannot be enabled simultaneously
- Node locking prevents race conditions in device allocation
- The plugin automatically registers supported devices based on configuration
- Batch scoring is used for NPU devices to optimize allocation decisions