Graceful Workload Shuffling #1352

@NVJCameron

What you would like to be added?

When the KAI scheduler moves a running container from one node to another, I would like an option to keep the original container running until the new one is up and ready.
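To make the requested ordering concrete, here is a minimal sketch of a "start the replacement, wait for readiness, then stop the original" move. It is a toy in-memory simulation, not KAI scheduler or Kubernetes code: `Pod`, `Cluster`, `graceful_move`, and the `wait_ready` callback are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a running container (not a KAI or Kubernetes type)."""
    name: str
    node: str
    ready: bool = False


@dataclass
class Cluster:
    """Toy in-memory cluster that records the order of lifecycle events."""
    pods: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def start(self, name, node):
        self.pods[name] = Pod(name, node)
        self.events.append(f"start {name} on {node}")

    def mark_ready(self, name):
        self.pods[name].ready = True
        self.events.append(f"ready {name}")

    def stop(self, name):
        del self.pods[name]
        self.events.append(f"stop {name}")


def graceful_move(cluster, old_name, new_name, target_node, wait_ready):
    """Start the replacement first; stop the original only once the replacement is ready.

    `wait_ready` is a hypothetical callback that returns True when the new
    container is ready to serve, or False if it never becomes ready.
    """
    cluster.start(new_name, target_node)
    if not wait_ready(new_name):
        # Replacement never became ready: remove it and keep serving from the original.
        cluster.stop(new_name)
        return False
    cluster.stop(old_name)
    return True


# Example: move a serving container from node-1 to node-2.
cluster = Cluster()
cluster.start("infer-a", "node-1")
cluster.mark_ready("infer-a")


def simulated_probe(name):
    cluster.mark_ready(name)  # pretend the readiness check passed
    return True


graceful_move(cluster, "infer-a", "infer-b", "node-2", simulated_probe)
print(cluster.events)
```

The property that matters is that the original is stopped only after the replacement reports ready, and that a replacement which never becomes ready leaves the original untouched.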

Why is this needed?

Inference workloads are often actively serving requests and can't simply be shut off without repercussions. These workloads are never idle. We need to be able to defragment cluster resources without impacting our users.

Labels: enhancement (New feature or request)
