Non-preemptible multi-device GPU memory jobs can exceed queue deserved quota #1369

@enoodle

What happened?

When a non-preemptible workload requests multiple GPU devices by GPU memory (e.g., 2 devices × 6 GiB each), the scheduler allows the job to run even when its total GPU consumption exceeds the queue's deservedGPUs non-preemptible quota.

Steps to reproduce:

  1. Create a queue with deservedGPUs: 1 (no hard limit)
  2. Submit a non-preemptible workload requesting 2 GPU devices with 60% of a node GPU's memory each (total = 2 × 0.6 = 1.2 GPU-fraction units)
  3. Observe the workload is scheduled, consuming 1.2 GPU-fraction units of non-preemptible quota against a queue deserving only 1.0
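
The arithmetic behind the reproduction steps can be sketched as follows. This is only an illustration, not KAI Scheduler code: the function name and parameters are hypothetical, and the 10 GiB per-device memory size is an assumption chosen so that 6 GiB corresponds to the 60% figure above.

```python
from fractions import Fraction


def gpu_fraction_units(num_devices: int,
                       mem_per_device_gib: int,
                       device_total_gib: int) -> Fraction:
    """Total GPU-fraction units requested across all devices (hypothetical helper)."""
    # Fraction of one physical GPU consumed on each requested device.
    per_device = Fraction(mem_per_device_gib, device_total_gib)
    # The request spans several devices, so the per-device fraction is multiplied.
    return num_devices * per_device


# Assumption: each GPU has 10 GiB of memory, so 6 GiB is 60% of a device.
requested = gpu_fraction_units(num_devices=2, mem_per_device_gib=6, device_total_gib=10)
deserved = Fraction(1)  # queue created with deservedGPUs: 1

print(float(requested), requested > deserved)  # prints: 1.2 True
```

Exact fractions are used instead of floats so that the comparison against the quota is not affected by rounding.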

What did you expect to happen?

A non-preemptible workload whose total GPU consumption across all requested devices exceeds the queue's deservedGPUs quota should remain Pending, consistent with the behavior of other non-preemptible workloads that exceed quota.
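
A minimal sketch of the expected check, under stated assumptions: the `Queue` class, its fields, and `expected_phase` are hypothetical names invented for illustration and do not reflect KAI Scheduler's actual types or code paths. It only expresses the rule that the total across all requested devices, added to existing non-preemptible usage, is what gets compared with the deserved quota.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Queue:
    """Hypothetical stand-in for a scheduler queue."""
    deserved_gpus: Fraction
    non_preemptible_in_use: Fraction = Fraction(0)


def expected_phase(queue: Queue, num_devices: int, fraction_per_device: Fraction) -> str:
    """Return the phase a non-preemptible workload is expected to end up in."""
    # Sum the request over every device, not just one of them.
    total = num_devices * fraction_per_device
    if queue.non_preemptible_in_use + total > queue.deserved_gpus:
        return "Pending"
    return "Schedulable"


queue = Queue(deserved_gpus=Fraction(1))
print(expected_phase(queue, 2, Fraction(3, 5)))  # 1.2 > 1.0, prints: Pending
print(expected_phase(queue, 1, Fraction(3, 5)))  # 0.6 <= 1.0, prints: Schedulable
```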


Environment

  • Kubernetes version: v1.34
  • KAI Scheduler version: v0.14.0
  • Tools: GPU sharing / fractional GPU feature must be enabled

Labels

bug: Something isn't working
