refactor(scheduler): remove Resource fields from PodInfo and PodGroup… #1238
Conversation
📝 Walkthrough: This pull request refactors the scheduler's resource representation model by separating GPU-specific resource requirements from general resource handling.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
Force-pushed d76a81e to 5add9ad
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/scheduler/api/node_info/node_info.go (1)
311-315: ⚠️ Potential issue | 🟠 Major — Reject GPU-memory requests on DRA-only nodes too.

`GpuRequirement.GPUs() > 0` is not a complete proxy for device-plugin GPU usage. GPU-memory requests are tracked separately in `pod_info.PodInfo`, so they can skip this branch and still be scheduled onto DRA-only nodes.

Suggested fix

```diff
-	if task.GpuRequirement.GPUs() > 0 && ni.HasDRAGPUs {
+	if ni.HasDRAGPUs && task.IsRequireAnyKindOfGPU() && task.GpuRequirement.GetDraGpusCount() == 0 {
 		log.InfraLogger.V(4).Infof("Task %s/%s rejected on node %s: device-plugin GPU request on DRA-only node",
 			task.Namespace, task.Name, ni.Name)
 		return common_info.NewFitError(task.Name, task.Namespace, ni.Name,
 			"device-plugin GPU requests cannot be scheduled on DRA-only nodes")
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/node_info/node_info.go` around lines 311 - 315, The current rejection only checks task.GpuRequirement.GPUs() and misses GPU-memory-only requests tracked on pod_info.PodInfo, so update the condition that uses ni.HasDRAGPUs to also detect GPU-memory requests from the pod info; e.g., change the if in node_info.go that reads "if task.GpuRequirement.GPUs() > 0 && ni.HasDRAGPUs" to check "if (task.GpuRequirement.GPUs() > 0 || task.PodInfo.HasGPUMemoryRequest() /* or task.PodInfo.GPUMemoryRequest > 0 */) && ni.HasDRAGPUs" (use the actual PodInfo field or helper in pod_info.PodInfo present in the codebase) so device-plugin GPU requests and GPU-memory-only requests are both rejected on DRA-only nodes and the same NewFitError message is returned.
🧹 Nitpick comments (7)
pkg/scheduler/plugins/topology/job_filtering_test.go (1)
2181-2194: Prefer a per-test `ResourceVectorMap` instead of the shared global.

Using `testVectorMap` here couples table cases through shared mutable state and can make tests order-dependent.

♻️ Suggested refactor

```diff
-	jobsInfoMap, _, _ := jobs_fake.BuildJobsAndTasksMaps(
-		[]*jobs_fake.TestJobBasic{tt.job}, testVectorMap)
+	vectorMap := resource_info.NewResourceVectorMap()
+	jobsInfoMap, _, _ := jobs_fake.BuildJobsAndTasksMaps(
+		[]*jobs_fake.TestJobBasic{tt.job}, vectorMap)
 	job := jobsInfoMap[common_info.PodGroupID(tt.job.Name)]
@@
-	result, err := plugin.getJobAllocatableDomains(job, &job.RootSubGroupSet.SubGroupInfo,
-		job.RootSubGroupSet.GetAllPodSets(), tasksResources.ToVector(testVectorMap), tasksCount,
+	result, err := plugin.getJobAllocatableDomains(job, &job.RootSubGroupSet.SubGroupInfo,
+		job.RootSubGroupSet.GetAllPodSets(), tasksResources.ToVector(vectorMap), tasksCount,
 		tt.topologyTree)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/topology/job_filtering_test.go` around lines 2181 - 2194, The test uses the shared mutable testVectorMap which can couple test cases; make each subtest create its own ResourceVectorMap and use that instance when building jobs/tasks and converting tasksResources to a vector. Specifically, in the table-driven test replace the shared testVectorMap with a per-test variable (e.g., localVectorMap) before calling jobs_fake.BuildJobsAndTasksMaps, pass localVectorMap into podgroup_info.GetTasksToAllocate/any other helpers that need the map, and call tasksResources.ToVector(localVectorMap) when invoking plugin.getJobAllocatableDomains so each test runs with its own ResourceVectorMap and no shared state.

pkg/scheduler/cache/cache.go (1)
273-274: Log wording is now misleading for the provided payload.

The message says “requires … GPUs” but logs `ResReqVector`, which is not GPU-only and can be hard to interpret in triage.

💡 Proposed log cleanup

```diff
-		"Creating bind request for task <%v/%v> to node <%v> gpuGroup: <%v>, requires: <%v> GPUs",
-		taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, taskInfo.ResReqVector)
+		"Creating bind request for task <%v/%v> to node <%v> gpuGroup: <%v>, requires GPU quota: <%v>",
+		taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, taskInfo.GpuRequirement.GetGpusQuota())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/cache/cache.go` around lines 273 - 274, The log string is misleading because it claims "requires ... GPUs" while printing taskInfo.ResReqVector (a full resource vector); update the format string used when creating the bind request log to reflect the actual payload by either printing a clear label for the resource vector (e.g., "resReqVector: <%v>") or by separately logging GPU count from taskInfo.GPUGroups (e.g., "gpuGroups: <%v>, resReqVector: <%v>"); locate the log invocation that uses taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, and taskInfo.ResReqVector and change the message text to accurately describe those fields.

pkg/scheduler/plugins/resourcetype/resourcetype_test.go (1)
119-125: Initialize vector fields in DRA task fixture for future-proofing.

`createFakeTaskWithDRA` currently leaves `ResReqVector`/`VectorMap` unset. If scoring logic later touches vector paths, this helper can become brittle.

♻️ Proposed fixture hardening

```diff
 func createFakeTaskWithDRA(taskName string, draGpuCount int64) *pod_info.PodInfo {
 	gpuReq := resource_info.NewGpuResourceRequirement()
 	gpuReq.SetDraGpus(map[string]int64{"nvidia.com/gpu": draGpuCount})
+	reqVector := resource_info.NewResourceVectorWithValues(0, 0, gpuReq.GetGpusQuota(), testVectorMap)
 	return &pod_info.PodInfo{
 		Name:           taskName,
 		GpuRequirement: *gpuReq,
+		ResReqVector:   reqVector,
+		VectorMap:      testVectorMap,
 		Pod: &v1.Pod{
 			ObjectMeta: metav1.ObjectMeta{
 				CreationTimestamp: metav1.Now(),
 			},
 		},
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/resourcetype/resourcetype_test.go` around lines 119 - 125, createFakeTaskWithDRA currently returns a PodInfo whose GpuRequirement leaves vector paths nil; to harden the fixture initialize the vector fields before returning: on the gpuReq (created by resource_info.NewGpuResourceRequirement()) set its ResReqVector to an empty/new ResReqVector instance and its VectorMap to an empty map (using the appropriate types from resource_info) so ResReqVector/VectorMap are non-nil in the returned pod_info.PodInfo; update createFakeTaskWithDRA accordingly.

pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go (1)
86-92: Build the over-capacity vector from `requestedQuota`, not three hard-coded resources.

`isAllocatedNonPreemptibleOverQuota` already reasons over `rs.AllResources`, but this message path rebuilds `vec` with only GPU/CPU/memory. A small helper that populates the vector from the map would keep the error text from drifting the next time queue-tracked resources change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go` around lines 86 - 92, The over-capacity vector is being constructed only for GPU/CPU/memory instead of using the full requestedQuota map; change the vec build in the block that calls GetBuildOverCapacityMessageForQueue to iterate over rs.AllResources (or the keys of requestedQuota) and for each resource call vec.Set(vectorMap.GetIndex(resourceName), requestedQuota[resourceName]) so the vector reflects all tracked resources; keep using resource_info.NewResourceVectorMap() and NewResourceVector(...) and pass that populated vec and vectorMap into api.GetBuildOverCapacityMessageForQueue.pkg/scheduler/api/resource_info/resource_info.go (2)
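The map-driven helper described above can be sketched as follows. This is a minimal, self-contained illustration: `ResourceVectorMap`, `ResourceVector`, and `vectorFromQuota` are simplified stand-ins for the scheduler's real `resource_info` types, not its actual API.

```go
package main

import "fmt"

// ResourceVectorMap is a simplified stand-in mapping resource names to
// vector slots; the real type lives in pkg/scheduler/api/resource_info.
type ResourceVectorMap struct {
	index map[string]int
	names []string
}

func NewResourceVectorMap(names ...string) *ResourceVectorMap {
	m := &ResourceVectorMap{index: map[string]int{}}
	for _, n := range names {
		m.index[n] = len(m.names)
		m.names = append(m.names, n)
	}
	return m
}

// GetIndex returns -1 for resources the map does not track.
func (m *ResourceVectorMap) GetIndex(name string) int {
	if i, ok := m.index[name]; ok {
		return i
	}
	return -1
}

func (m *ResourceVectorMap) Len() int { return len(m.names) }

type ResourceVector []float64

// vectorFromQuota populates a vector from every entry in the quota map,
// rather than hard-coding GPU/CPU/memory, so newly tracked resources
// are reflected automatically; unknown names are silently skipped.
func vectorFromQuota(quota map[string]float64, m *ResourceVectorMap) ResourceVector {
	vec := make(ResourceVector, m.Len())
	for name, val := range quota {
		if idx := m.GetIndex(name); idx >= 0 {
			vec[idx] = val
		}
	}
	return vec
}

func main() {
	m := NewResourceVectorMap("cpu", "memory", "gpu", "pods")
	quota := map[string]float64{"cpu": 4000, "memory": 8e9, "gpu": 2, "pods": 10}
	vec := vectorFromQuota(quota, m)
	fmt.Println(vec[m.GetIndex("pods")]) // the fourth resource is carried too, not just gpu/cpu/memory
}
```

With such a helper, the over-capacity message path and `isAllocatedNonPreemptibleOverQuota` would both derive from the same map, so they cannot drift apart.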
pkg/scheduler/api/resource_info/resource_info.go (2)

158-180: Consider using the `GpuResource` constant instead of the hardcoded "gpu" string.

Line 161 uses a hardcoded `"gpu"` string for `vectorMap.GetIndex("gpu")`. For consistency with the rest of the codebase, consider using the `commonconstants.GpuResource` constant (as seen in other files like `gpu_sharing_node_info.go`).

Additionally, the conversion from `float64` to `int64` at line 174 may silently truncate fractional scalar resource values. If fractional quantities are valid for some scalar resources, this could cause precision loss.

Suggested improvements

```diff
 func (r *Resource) AddVectorAndGpuReq(vec ResourceVector, vectorMap *ResourceVectorMap, gpuReq *GpuResourceRequirement) {
 	cpuIdx := vectorMap.GetIndex(v1.ResourceCPU)
 	memIdx := vectorMap.GetIndex(v1.ResourceMemory)
-	gpuIdx := vectorMap.GetIndex("gpu")
+	gpuIdx := vectorMap.GetIndex(commonconstants.GpuResource)
 	r.milliCpu += vec.Get(cpuIdx)
 	r.memory += vec.Get(memIdx)
 	r.gpus += gpuReq.GPUs()
 	for i := 0; i < vectorMap.Len(); i++ {
 		if i == cpuIdx || i == memIdx || i == gpuIdx {
 			continue
 		}
 		val := vec.Get(i)
 		if val != 0 {
 			rName := v1.ResourceName(vectorMap.ResourceAt(i))
-			r.BaseResource.scalarResources[rName] += int64(val)
+			r.BaseResource.scalarResources[rName] += int64(math.Round(val))
 		}
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/resource_info/resource_info.go` around lines 158 - 180, Replace the hardcoded "gpu" string in AddVectorAndGpuReq by using the shared constant commonconstants.GpuResource when calling vectorMap.GetIndex to match the rest of the codebase; also address the float64→int64 conversion when writing to r.BaseResource.scalarResources in AddVectorAndGpuReq by either converting with a rounding step (e.g., int64(math.Round(val))) or changing the scalarResources storage type to preserve fractional values, and update imports/usages accordingly (referencing AddVectorAndGpuReq, ResourceVectorMap.GetIndex, r.BaseResource.scalarResources, and GpuResource constant).
193-215: Same concerns apply as in `AddVectorAndGpuReq`.

The hardcoded `"gpu"` string at line 196 and the `float64` to `int64` conversion at line 209 have the same concerns as noted for `AddVectorAndGpuReq`. Consider applying the same fixes for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/resource_info/resource_info.go` around lines 193 - 215, SubVectorAndGpuReq repeats the same issues as AddVectorAndGpuReq: a hardcoded "gpu" resource name and an unsafe float64→int64 conversion when subtracting scalar resources. Replace the literal "gpu" index lookup with the same canonical resource identifier used in AddVectorAndGpuReq (use the vectorMap lookup for the canonical GPU resource constant) and when subtracting vec.Get(i) from r.BaseResource.scalarResources[rName] convert the float value to an integer safely (use the same numeric conversion used in AddVectorAndGpuReq, e.g., rounding or explicit integer-safe conversion) so you don’t truncate/corrupt resource counts; keep references to SubVectorAndGpuReq, AddVectorAndGpuReq, vectorMap.GetIndex, vectorMap.ResourceAt and r.BaseResource.scalarResources to locate the changes.pkg/scheduler/api/unschedule_info.go (1)
pkg/scheduler/api/unschedule_info.go (1)

20-57: Consider using a constant for the "gpu" resource name.

The function correctly migrates to vector-based resource access. However, there's an inconsistency in how resource names are accessed:
- Line 33: uses the hardcoded `"gpu"` string
- Line 41: uses the `v1.ResourceCPU` constant
- Line 49: uses the `v1.ResourceMemory` constant

For consistency and maintainability, consider using a constant for the GPU resource name (e.g., `commonconstants.GpuResource` or `constants.GpuResource` as used elsewhere in the codebase).

Suggested fix

```diff
+import commonconstants "github.com/kai-scheduler/KAI-scheduler/pkg/common/constants"
+
 func getOverCapacityMessageDetails(queueName, resourceName string, deserved, used float64,
 	requestedResources resource_info.ResourceVector, vectorMap *resource_info.ResourceVectorMap) string {
 	switch resourceName {
 	case GpuResource:
 		return fmt.Sprintf("Workload requested %v GPUs, but %s quota is %v GPUs, "+
 			"while %v GPUs are already allocated for non-preemptible pods.",
-			requestedResources.Get(vectorMap.GetIndex("gpu")),
+			requestedResources.Get(vectorMap.GetIndex(commonconstants.GpuResource)),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/unschedule_info.go` around lines 20 - 57, The code mixes a hardcoded "gpu" string with constants for CPU/Memory in getOverCapacityMessageDetails/GetBuildOverCapacityMessageForQueue; replace the literal "gpu" passed to vectorMap.GetIndex with the project constant used elsewhere (e.g., commonconstants.GpuResource or the repo's GpuResource constant) so GPU lookups match v1.ResourceCPU/v1.ResourceMemory usage and are maintainable; update any imports/usages referenced by getOverCapacityMessageDetails and GetBuildOverCapacityMessageForQueue to use that constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/scheduler/api/node_info/node_info.go`:
- Around line 688-695: GetRequiredInitQuota is overwriting the total GPU slot in
result with a per-device fractional value, which reduces multi-GPU requests;
instead only replace the GPU entry when the request is a pure GPU-memory request
(i.e., no explicit GPU count in pi.ResReqVector but only memory-based
requirement). Modify GetRequiredInitQuota to: obtain gpuIdx via
ni.VectorMap.GetIndex("gpu"), detect if pi.GpuRequirement is memory-only (e.g.,
MigResources() non-empty logic inverted or check whether pi.ResReqVector already
contains a GPU count), and only call result.Set(gpuIdx,
ni.getGpuMemoryFractionalOnNode(ni.GetResourceGpuMemory(&pi.GpuRequirement)))
for pure GPU-memory cases; otherwise leave the existing value from
pi.ResReqVector untouched so multi-device totals (2, 3 x 0.5, etc.) are
preserved.
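The guard this prompt asks for can be sketched as below. `GpuRequirement`, `gpuSlotValue`, and the memory-to-fraction conversion are simplified stand-ins for the scheduler's real types, under the assumption that an explicit device count should always take precedence over a memory-derived fraction.

```go
package main

import "fmt"

// GpuRequirement is an illustrative stand-in for the scheduler's real type:
// a request carries either an explicit device count or a GPU-memory amount.
type GpuRequirement struct {
	Gpus      float64 // explicit device count, e.g. 2, or 0.5 for a shared GPU
	GpuMemory int64   // bytes requested via a GPU-memory-only request
}

// gpuSlotValue decides what to write into the vector's GPU slot: keep the
// explicit device count when one exists, and only fall back to a
// memory-derived fraction for pure GPU-memory requests.
func gpuSlotValue(req GpuRequirement, nodeGpuMemory int64) float64 {
	if req.Gpus > 0 {
		return req.Gpus // preserve multi-device totals (2, 3 x 0.5, ...)
	}
	if req.GpuMemory > 0 && nodeGpuMemory > 0 {
		return float64(req.GpuMemory) / float64(nodeGpuMemory)
	}
	return 0
}

func main() {
	// A 2-GPU request must not be shrunk to a per-device fraction.
	fmt.Println(gpuSlotValue(GpuRequirement{Gpus: 2}, 16<<30)) // prints 2
	// A memory-only request of half the node's GPU memory becomes 0.5.
	fmt.Println(gpuSlotValue(GpuRequirement{GpuMemory: 8 << 30}, 16<<30)) // prints 0.5
}
```

The key property is that the two branches are mutually exclusive, so a multi-GPU total read from `pi.ResReqVector` is never overwritten by the fractional path.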
In `@pkg/scheduler/api/pod_info/pod_info_test.go`:
- Around line 466-475: The test fixture creates a PodInfo without ResReqVector
or AcceptedResourceVector, but updatePodAdditionalFields() now rebuilds
ResReqVector at the end; to mirror production and catch regressions, initialize
those vectors in the fixture: after constructing the PodInfo (pi :=
&PodInfo{...}), either call updatePodAdditionalFields(pi) or explicitly populate
pi.ResReqVector and pi.AcceptedResourceVector (derived from pi.GpuRequirement
and vectorMap) so the test uses the same vector state as the real code.
In `@pkg/scheduler/api/pod_info/pod_info.go`:
- Around line 261-265: SetVectorMap currently only swaps pi.VectorMap leaving
pi.ResReqVector and pi.AcceptedResourceVector using the old map indices; update
SetVectorMap to rebuild both vectors to match the new map: use the new vectorMap
to re-create pi.ResReqVector from the pod's resource request fields (so GetIndex
lookups map correctly) and reinitialize or remap pi.AcceptedResourceVector to
the new length/indices (preserving any known accepted values if available,
otherwise zeroing), ensuring PodInfo.SetVectorMap and PodGroupInfo.SetVectorMap
operate safely with the new map.
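The rebinding step this prompt describes can be sketched as a projection from old indices to new ones. The vector types are simplified stand-ins for `resource_info`'s, and `remapVector` is a hypothetical helper, not the project's actual rebinding function.

```go
package main

import "fmt"

// ResourceVectorMap is a simplified stand-in: the slice order defines
// each resource's slot in a vector.
type ResourceVectorMap struct {
	names []string
}

// GetIndex returns -1 for resources the map does not track.
func (m *ResourceVectorMap) GetIndex(name string) int {
	for i, n := range m.names {
		if n == name {
			return i
		}
	}
	return -1
}

func (m *ResourceVectorMap) Len() int { return len(m.names) }

type ResourceVector []float64

// remapVector projects a vector indexed by oldMap onto newMap, so that
// swapping a VectorMap does not leave values bound to stale indices.
// Resources absent from newMap are dropped; new resources start at zero.
func remapVector(vec ResourceVector, oldMap, newMap *ResourceVectorMap) ResourceVector {
	out := make(ResourceVector, newMap.Len())
	for i, name := range oldMap.names {
		if j := newMap.GetIndex(name); j >= 0 && i < len(vec) {
			out[j] = vec[i]
		}
	}
	return out
}

func main() {
	oldMap := &ResourceVectorMap{names: []string{"cpu", "memory", "gpu"}}
	newMap := &ResourceVectorMap{names: []string{"gpu", "cpu", "memory", "pods"}}
	vec := ResourceVector{4000, 8e9, 2} // indexed by oldMap
	// gpu moves from slot 2 to slot 0; pods appears as a zeroed new slot.
	fmt.Println(remapVector(vec, oldMap, newMap))
}
```

A `SetVectorMap` that applies this projection to `ResReqVector` and `AcceptedResourceVector` before storing the new map would keep both vectors consistent with the map they are read against.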
In `@pkg/scheduler/api/podgroup_info/job_info.go`:
- Around line 455-463: SetVectorMap currently replaces pgi.VectorMap and
rebuilds pgi.AllocatedVector but still sums the tasks' existing ResReqVector
values which may be bound to the old map; iterate pgi.GetAllPodsMap() and for
each task rebind or project its ResReqVector onto the new vectorMap (e.g. call
the vector rebinding/projection helper used elsewhere in the codebase) before
calling pgi.AllocatedVector.Add(...), so AllocatedVector is computed from
vectors that are consistent with pgi.VectorMap; update SetVectorMap to perform
this rebind/projection step for each task prior to aggregation.
In `@pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy.go`:
- Around line 50-52: The slice access using node.VectorMap.GetIndex(...) on
requiredInitQuota is unsafe because GetIndex can return -1 (e.g., GPU may be
absent) and direct indexing causes panics; update the code that builds the tuple
using requiredInitQuota[node.VectorMap.GetIndex(v1.ResourceCPU)],
requiredInitQuota[node.VectorMap.GetIndex(v1.ResourceMemory)],
requiredInitQuota[node.VectorMap.GetIndex(constants.GpuResource)] to use safe
lookups instead — either call the VectorMap.Get(index) accessor that handles
bounds or guard the index (check GetIndex(...) >= 0) before indexing, mirroring
the pattern in GetRequiredInitQuota which uses .Set/checked access; ensure you
reference GetIndex, VectorMap, requiredInitQuota, v1.ResourceCPU,
v1.ResourceMemory, and constants.GpuResource when making the change.
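The bounds-checked access this prompt recommends can be sketched as a small accessor. `ResourceVector.Get` here is illustrative and assumes, as the prompt does, that `GetIndex` returns -1 for an absent resource.

```go
package main

import "fmt"

// ResourceVector is a simplified stand-in with a bounds-checked accessor,
// mirroring the checked .Get access pattern the review recommends.
type ResourceVector []float64

// Get returns 0 for a negative index (GetIndex's "not found" result) or an
// out-of-range index, instead of panicking like direct slice indexing would.
func (v ResourceVector) Get(idx int) float64 {
	if idx < 0 || idx >= len(v) {
		return 0
	}
	return v[idx]
}

func main() {
	vec := ResourceVector{4000, 8e9} // cpu, memory; no gpu tracked on this node
	gpuIdx := -1                     // what GetIndex returns for an absent resource
	fmt.Println(vec.Get(gpuIdx))     // prints 0, rather than panicking on vec[gpuIdx]
}
```

Using `requiredInitQuota.Get(node.VectorMap.GetIndex(...))` instead of raw indexing makes a missing GPU entry degrade to zero rather than crash the scheduling cycle.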
In `@pkg/scheduler/plugins/topology/job_filtering_test.go`:
- Around line 2205-2211: The test currently reports a length mismatch but still
indexes job.JobFitErrors[i], which can panic when actual fit errors are shorter
than expected; update the validation in the test (job_filtering_test.go) to
guard access to job.JobFitErrors before indexing—either iterate only up to
min(len(job.JobFitErrors), len(tt.expectedFitErrors)) or skip per-item
comparisons when the lengths differ (e.g., continue after the initial length
check) so accesses like job.JobFitErrors[i] are never out-of-bounds.
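The guard this prompt describes can be sketched as a small comparison helper; `compareFitErrors` and the string-based error type are hypothetical simplifications of the test's actual fit-error structures.

```go
package main

import "fmt"

// compareFitErrors reports mismatches between actual and expected fit
// errors. It records a length mismatch, then compares only up to the
// shorter of the two slices so indexing never goes out of bounds.
func compareFitErrors(actual, expected []string) []string {
	var mismatches []string
	if len(actual) != len(expected) {
		mismatches = append(mismatches,
			fmt.Sprintf("length mismatch: got %d, want %d", len(actual), len(expected)))
	}
	n := len(actual)
	if len(expected) < n {
		n = len(expected)
	}
	for i := 0; i < n; i++ { // safe: i is always within both slices
		if actual[i] != expected[i] {
			mismatches = append(mismatches,
				fmt.Sprintf("error %d: got %q, want %q", i, actual[i], expected[i]))
		}
	}
	return mismatches
}

func main() {
	// A shorter actual list no longer panics; it reports the length mismatch.
	fmt.Println(compareFitErrors(
		[]string{"node A: no gpu"},
		[]string{"node A: no gpu", "node B: no cpu"}))
}
```

The same min-length pattern applied to `job.JobFitErrors` keeps the test's per-item comparison from indexing past the shorter slice.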
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9b0f5e4a-f563-46b6-bded-124627a7c032
📒 Files selected for processing (47)
- pkg/scheduler/actions/common/feasible_nodes_test.go
- pkg/scheduler/actions/common/minimal_job_comparison.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/idle_gpus.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/idle_gpus_test.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/topology_aware_idle_gpus.go
- pkg/scheduler/actions/common/solvers/pod_scenario_builder_test.go
- pkg/scheduler/actions/utils/job_order_by_queue_test.go
- pkg/scheduler/api/common_info/pod_errors.go
- pkg/scheduler/api/common_info/pod_errors_test.go
- pkg/scheduler/api/node_info/gpu_sharing_node_info.go
- pkg/scheduler/api/node_info/node_info.go
- pkg/scheduler/api/pod_info/pod_info.go
- pkg/scheduler/api/pod_info/pod_info_benchmark_test.go
- pkg/scheduler/api/pod_info/pod_info_test.go
- pkg/scheduler/api/podgroup_info/allocation_info.go
- pkg/scheduler/api/podgroup_info/allocation_info_test.go
- pkg/scheduler/api/podgroup_info/job_info.go
- pkg/scheduler/api/podgroup_info/job_info_test.go
- pkg/scheduler/api/resource_info/gpu_resource_requirment.go
- pkg/scheduler/api/resource_info/resource_info.go
- pkg/scheduler/api/unschedule_info.go
- pkg/scheduler/api/unschedule_info_test.go
- pkg/scheduler/cache/cache.go
- pkg/scheduler/framework/session.go
- pkg/scheduler/framework/statement_checkpoint_test.go
- pkg/scheduler/framework/statement_test.go
- pkg/scheduler/framework/statement_test_utils.go
- pkg/scheduler/gpu_sharing/gpuSharing.go
- pkg/scheduler/k8s_internal/predicates/maxNodeResources.go
- pkg/scheduler/plugins/elastic/elastic_test.go
- pkg/scheduler/plugins/gpusharingorder/gpusharingorder.go
- pkg/scheduler/plugins/nodeavailability/nodeavailability_test.go
- pkg/scheduler/plugins/nodeplacement/nodespread_test.go
- pkg/scheduler/plugins/predicates/predicates.go
- pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy.go
- pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy_test.go
- pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go
- pkg/scheduler/plugins/proportion/proportion.go
- pkg/scheduler/plugins/proportion/proportion_test.go
- pkg/scheduler/plugins/proportion/queue_order/queue_order_test.go
- pkg/scheduler/plugins/proportion/reclaimable/reclaimable_test.go
- pkg/scheduler/plugins/resourcetype/resourcetype_test.go
- pkg/scheduler/plugins/topology/job_filtering.go
- pkg/scheduler/plugins/topology/job_filtering_test.go
- pkg/scheduler/test_utils/jobs_fake/jobs.go
- pkg/scheduler/test_utils/test_utils.go
- pkg/scheduler/test_utils/test_utils_builder.go
💤 Files with no reviewable changes (1)
- pkg/scheduler/plugins/proportion/proportion_test.go
📊 Performance Benchmark Results
Merging this branch changes the coverage (3 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
Force-pushed 5add9ad to d6a4e73
…Info, remove JobRequirement

Remove deprecated Resource-based fields and types:
- PodInfo: remove ResReq, AcceptedResource; add GpuRequirement, AcceptedGpuRequirement
- PodGroupInfo: remove Allocated, tasksToAllocateInitResource
- Remove JobRequirement struct entirely, replace with ResourceVector/ResourceQuantities

All resource operations now use vectors exclusively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Erez Freiberger <enoodle@gmail.com>
…stant

Signed-off-by: Erez Freiberger <enoodle@gmail.com>
Force-pushed d6a4e73 to fe7f3f6
Merging this branch changes the coverage (3 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
Signed-off-by: Erez Freiberger <enoodle@gmail.com>
Merging this branch changes the coverage (4 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
…Info, remove JobRequirement
Description
Remove deprecated Resource-based fields and types:
All resource operations now use vectors exclusively.
Fixes #1354