Non-preemptible multi-device GPU memory jobs can exceed queue deserved quota

**What happened?**

When submitting a non-preemptible workload that requests **multiple GPU devices using GPU memory** (e.g., 2 devices × 6 GiB each), the scheduler allows the job to run even when doing so exceeds the queue's `deservedGPUs` non-preemptible quota.

**Steps to reproduce:**
1. Create a queue with `deservedGPUs: 1` (no hard limit)
2. Submit a non-preemptible workload requesting 2 GPU devices, each with GPU memory equal to 60% of a single GPU's memory on the node (total = 2 × 0.6 = 1.2 GPU-fraction units)
3. Observe the workload is scheduled, consuming 1.2 GPU-fraction units of non-preemptible quota against a queue deserving only 1.0
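
The arithmetic behind the steps above can be sketched as follows. This is only an illustration of the accounting, not KAI Scheduler code: the function name is hypothetical, and the 10 GiB device size is an assumption chosen so that the 6 GiB per-device request equals 60% of one GPU.

```python
from fractions import Fraction


def gpu_fraction_request(num_devices: int,
                         memory_per_device_gib: int,
                         device_memory_gib: int) -> Fraction:
    """Return the total request in GPU-fraction units.

    Hypothetical helper: each device contributes
    memory_per_device_gib / device_memory_gib of a GPU.
    """
    if device_memory_gib <= 0:
        raise ValueError("device_memory_gib must be positive")
    return num_devices * Fraction(memory_per_device_gib, device_memory_gib)


# Assumed 10 GiB GPUs, so 6 GiB per device is 60% of one GPU.
request = gpu_fraction_request(num_devices=2,
                               memory_per_device_gib=6,
                               device_memory_gib=10)
deserved = Fraction(1)  # deservedGPUs: 1

print(float(request), request > deserved)  # prints "1.2 True"
```

`Fraction` is used instead of floats so the comparison against the quota is exact.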

---

**What did you expect to happen?**

A non-preemptible workload whose total GPU consumption across all requested devices exceeds the queue's `deservedGPUs` quota should remain `Pending`, consistent with the behavior of other non-preemptible workloads that exceed quota.
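
A minimal sketch of the expected admission rule, assuming the check compares the queue's existing non-preemptible allocation plus the workload's total request (summed over all devices) against `deservedGPUs`. The function and parameter names are hypothetical and do not reflect the scheduler's actual implementation.

```python
from fractions import Fraction


def should_admit_non_preemptible(allocated: Fraction,
                                 requested_total: Fraction,
                                 deserved: Fraction) -> bool:
    """Admit only if the queue stays within its deserved quota.

    requested_total must be the sum over ALL requested devices,
    not the fraction requested on a single device.
    """
    return allocated + requested_total <= deserved


deserved = Fraction(1)   # deservedGPUs: 1
allocated = Fraction(0)  # assumed: queue is otherwise empty

# 2 devices x 0.6 GPU each = 1.2 > 1.0: expected to stay Pending.
two_devices = should_admit_non_preemptible(allocated, Fraction("1.2"), deserved)

# 1 device x 0.6 GPU = 0.6 <= 1.0: expected to be scheduled.
one_device = should_admit_non_preemptible(allocated, Fraction("0.6"), deserved)

print(two_devices, one_device)  # prints "False True"
```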

---

**Environment**

- Kubernetes version: v1.34
- KAI Scheduler version: v0.14.0
- Tools: GPU sharing / fractional GPU feature must be enabled

Non-preemptible multi-device GPU memory jobs can exceed queue deserved quota #1369
