refactor(scheduler): remove Resource fields from PodInfo and PodGroup… #1238
Conversation
📝 Walkthrough: This pull request refactors the scheduler's resource representation model by separating GPU-specific resource requirements from general resource handling.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
Force-pushed d76a81e to 5add9ad
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/scheduler/api/node_info/node_info.go (1)
311-315: ⚠️ Potential issue | 🟠 Major — Reject GPU-memory requests on DRA-only nodes too.

`GpuRequirement.GPUs() > 0` is not a complete proxy for device-plugin GPU usage. GPU-memory requests are tracked separately in `pod_info.PodInfo`, so they can skip this branch and still be scheduled onto DRA-only nodes.

Suggested fix

```diff
-	if task.GpuRequirement.GPUs() > 0 && ni.HasDRAGPUs {
+	if ni.HasDRAGPUs && task.IsRequireAnyKindOfGPU() && task.GpuRequirement.GetDraGpusCount() == 0 {
 		log.InfraLogger.V(4).Infof("Task %s/%s rejected on node %s: device-plugin GPU request on DRA-only node",
 			task.Namespace, task.Name, ni.Name)
 		return common_info.NewFitError(task.Name, task.Namespace, ni.Name,
 			"device-plugin GPU requests cannot be scheduled on DRA-only nodes")
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/node_info/node_info.go` around lines 311 - 315, The current rejection only checks task.GpuRequirement.GPUs() and misses GPU-memory-only requests tracked on pod_info.PodInfo, so update the condition that uses ni.HasDRAGPUs to also detect GPU-memory requests from the pod info; e.g., change the if in node_info.go that reads "if task.GpuRequirement.GPUs() > 0 && ni.HasDRAGPUs" to check "if (task.GpuRequirement.GPUs() > 0 || task.PodInfo.HasGPUMemoryRequest() /* or task.PodInfo.GPUMemoryRequest > 0 */) && ni.HasDRAGPUs" (use the actual PodInfo field or helper in pod_info.PodInfo present in the codebase) so device-plugin GPU requests and GPU-memory-only requests are both rejected on DRA-only nodes and the same NewFitError message is returned.
🧹 Nitpick comments (7)
pkg/scheduler/plugins/topology/job_filtering_test.go (1)
2181-2194: Prefer a per-test `ResourceVectorMap` instead of the shared global.

Using `testVectorMap` here couples table cases through shared mutable state and can make tests order-dependent.

♻️ Suggested refactor

```diff
-	jobsInfoMap, _, _ := jobs_fake.BuildJobsAndTasksMaps(
-		[]*jobs_fake.TestJobBasic{tt.job}, testVectorMap)
+	vectorMap := resource_info.NewResourceVectorMap()
+	jobsInfoMap, _, _ := jobs_fake.BuildJobsAndTasksMaps(
+		[]*jobs_fake.TestJobBasic{tt.job}, vectorMap)
 	job := jobsInfoMap[common_info.PodGroupID(tt.job.Name)]
@@
-	result, err := plugin.getJobAllocatableDomains(job, &job.RootSubGroupSet.SubGroupInfo,
-		job.RootSubGroupSet.GetAllPodSets(), tasksResources.ToVector(testVectorMap), tasksCount,
+	result, err := plugin.getJobAllocatableDomains(job, &job.RootSubGroupSet.SubGroupInfo,
+		job.RootSubGroupSet.GetAllPodSets(), tasksResources.ToVector(vectorMap), tasksCount,
 		tt.topologyTree)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/topology/job_filtering_test.go` around lines 2181 - 2194, The test uses the shared mutable testVectorMap which can couple test cases; make each subtest create its own ResourceVectorMap and use that instance when building jobs/tasks and converting tasksResources to a vector. Specifically, in the table-driven test replace the shared testVectorMap with a per-test variable (e.g., localVectorMap) before calling jobs_fake.BuildJobsAndTasksMaps, pass localVectorMap into podgroup_info.GetTasksToAllocate/any other helpers that need the map, and call tasksResources.ToVector(localVectorMap) when invoking plugin.getJobAllocatableDomains so each test runs with its own ResourceVectorMap and no shared state.

pkg/scheduler/cache/cache.go (1)
273-274: Log wording is now misleading for the provided payload.

The message says “requires … GPUs” but logs `ResReqVector`, which is not GPU-only and can be hard to interpret in triage.

💡 Proposed log cleanup

```diff
-		"Creating bind request for task <%v/%v> to node <%v> gpuGroup: <%v>, requires: <%v> GPUs",
-		taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, taskInfo.ResReqVector)
+		"Creating bind request for task <%v/%v> to node <%v> gpuGroup: <%v>, requires GPU quota: <%v>",
+		taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, taskInfo.GpuRequirement.GetGpusQuota())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/cache/cache.go` around lines 273 - 274, The log string is misleading because it claims "requires ... GPUs" while printing taskInfo.ResReqVector (a full resource vector); update the format string used when creating the bind request log to reflect the actual payload by either printing a clear label for the resource vector (e.g., "resReqVector: <%v>") or by separately logging GPU count from taskInfo.GPUGroups (e.g., "gpuGroups: <%v>, resReqVector: <%v>"); locate the log invocation that uses taskInfo.Namespace, taskInfo.Name, hostname, taskInfo.GPUGroups, and taskInfo.ResReqVector and change the message text to accurately describe those fields.

pkg/scheduler/plugins/resourcetype/resourcetype_test.go (1)
119-125: Initialize vector fields in DRA task fixture for future-proofing.

`createFakeTaskWithDRA` currently leaves `ResReqVector`/`VectorMap` unset. If scoring logic later touches vector paths, this helper can become brittle.

♻️ Proposed fixture hardening

```diff
 func createFakeTaskWithDRA(taskName string, draGpuCount int64) *pod_info.PodInfo {
 	gpuReq := resource_info.NewGpuResourceRequirement()
 	gpuReq.SetDraGpus(map[string]int64{"nvidia.com/gpu": draGpuCount})
+	reqVector := resource_info.NewResourceVectorWithValues(0, 0, gpuReq.GetGpusQuota(), testVectorMap)
 	return &pod_info.PodInfo{
 		Name:           taskName,
 		GpuRequirement: *gpuReq,
+		ResReqVector:   reqVector,
+		VectorMap:      testVectorMap,
 		Pod: &v1.Pod{
 			ObjectMeta: metav1.ObjectMeta{
 				CreationTimestamp: metav1.Now(),
 			},
 		},
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/resourcetype/resourcetype_test.go` around lines 119 - 125, createFakeTaskWithDRA currently returns a PodInfo whose GpuRequirement leaves vector paths nil; to harden the fixture initialize the vector fields before returning: on the gpuReq (created by resource_info.NewGpuResourceRequirement()) set its ResReqVector to an empty/new ResReqVector instance and its VectorMap to an empty map (using the appropriate types from resource_info) so ResReqVector/VectorMap are non-nil in the returned pod_info.PodInfo; update createFakeTaskWithDRA accordingly.

pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go (1)
86-92: Build the over-capacity vector from `requestedQuota`, not three hard-coded resources.

`isAllocatedNonPreemptibleOverQuota` already reasons over `rs.AllResources`, but this message path rebuilds `vec` with only GPU/CPU/memory. A small helper that populates the vector from the map would keep the error text from drifting the next time queue-tracked resources change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go` around lines 86 - 92, The over-capacity vector is being constructed only for GPU/CPU/memory instead of using the full requestedQuota map; change the vec build in the block that calls GetBuildOverCapacityMessageForQueue to iterate over rs.AllResources (or the keys of requestedQuota) and for each resource call vec.Set(vectorMap.GetIndex(resourceName), requestedQuota[resourceName]) so the vector reflects all tracked resources; keep using resource_info.NewResourceVectorMap() and NewResourceVector(...) and pass that populated vec and vectorMap into api.GetBuildOverCapacityMessageForQueue.pkg/scheduler/api/resource_info/resource_info.go (2)
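The map-driven helper described above can be sketched as follows. This is a minimal, self-contained illustration: `ResourceVectorMap`, `ResourceVector`, and `vectorFromQuota` are simplified stand-ins for the scheduler's real `resource_info` types, not its actual API.

```go
package main

import "fmt"

// ResourceVectorMap is a simplified stand-in mapping resource names to
// vector slots; the real type lives in pkg/scheduler/api/resource_info.
type ResourceVectorMap struct {
	index map[string]int
	names []string
}

func NewResourceVectorMap(names ...string) *ResourceVectorMap {
	m := &ResourceVectorMap{index: map[string]int{}}
	for _, n := range names {
		m.index[n] = len(m.names)
		m.names = append(m.names, n)
	}
	return m
}

// GetIndex returns -1 for resources the map does not track.
func (m *ResourceVectorMap) GetIndex(name string) int {
	if i, ok := m.index[name]; ok {
		return i
	}
	return -1
}

func (m *ResourceVectorMap) Len() int { return len(m.names) }

type ResourceVector []float64

// vectorFromQuota populates a vector from every entry in the quota map,
// rather than hard-coding GPU/CPU/memory, so newly tracked resources
// are reflected automatically; unknown names are silently skipped.
func vectorFromQuota(quota map[string]float64, m *ResourceVectorMap) ResourceVector {
	vec := make(ResourceVector, m.Len())
	for name, val := range quota {
		if idx := m.GetIndex(name); idx >= 0 {
			vec[idx] = val
		}
	}
	return vec
}

func main() {
	m := NewResourceVectorMap("cpu", "memory", "gpu", "pods")
	quota := map[string]float64{"cpu": 4000, "memory": 8e9, "gpu": 2, "pods": 10}
	vec := vectorFromQuota(quota, m)
	fmt.Println(vec[m.GetIndex("pods")]) // the fourth resource is carried too, not just gpu/cpu/memory
}
```

With such a helper, the over-capacity message path and `isAllocatedNonPreemptibleOverQuota` would both derive from the same map, so they cannot drift apart.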
pkg/scheduler/api/resource_info/resource_info.go (2)

158-180: Consider using the `GpuResource` constant instead of the hardcoded "gpu" string.

Line 161 uses a hardcoded `"gpu"` string for `vectorMap.GetIndex("gpu")`. For consistency with the rest of the codebase, consider using the `commonconstants.GpuResource` constant (as seen in other files like `gpu_sharing_node_info.go`).

Additionally, the conversion from `float64` to `int64` at line 174 may silently truncate fractional scalar resource values. If fractional quantities are valid for some scalar resources, this could cause precision loss.

Suggested improvements

```diff
 func (r *Resource) AddVectorAndGpuReq(vec ResourceVector, vectorMap *ResourceVectorMap, gpuReq *GpuResourceRequirement) {
 	cpuIdx := vectorMap.GetIndex(v1.ResourceCPU)
 	memIdx := vectorMap.GetIndex(v1.ResourceMemory)
-	gpuIdx := vectorMap.GetIndex("gpu")
+	gpuIdx := vectorMap.GetIndex(commonconstants.GpuResource)
 	r.milliCpu += vec.Get(cpuIdx)
 	r.memory += vec.Get(memIdx)
 	r.gpus += gpuReq.GPUs()
 	for i := 0; i < vectorMap.Len(); i++ {
 		if i == cpuIdx || i == memIdx || i == gpuIdx {
 			continue
 		}
 		val := vec.Get(i)
 		if val != 0 {
 			rName := v1.ResourceName(vectorMap.ResourceAt(i))
-			r.BaseResource.scalarResources[rName] += int64(val)
+			r.BaseResource.scalarResources[rName] += int64(math.Round(val))
 		}
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/resource_info/resource_info.go` around lines 158 - 180, Replace the hardcoded "gpu" string in AddVectorAndGpuReq by using the shared constant commonconstants.GpuResource when calling vectorMap.GetIndex to match the rest of the codebase; also address the float64→int64 conversion when writing to r.BaseResource.scalarResources in AddVectorAndGpuReq by either converting with a rounding step (e.g., int64(math.Round(val))) or changing the scalarResources storage type to preserve fractional values, and update imports/usages accordingly (referencing AddVectorAndGpuReq, ResourceVectorMap.GetIndex, r.BaseResource.scalarResources, and GpuResource constant).
193-215: Same concerns apply as in `AddVectorAndGpuReq`.

The hardcoded `"gpu"` string at line 196 and the `float64` to `int64` conversion at line 209 have the same concerns as noted for `AddVectorAndGpuReq`. Consider applying the same fixes for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/resource_info/resource_info.go` around lines 193 - 215, SubVectorAndGpuReq repeats the same issues as AddVectorAndGpuReq: a hardcoded "gpu" resource name and an unsafe float64→int64 conversion when subtracting scalar resources. Replace the literal "gpu" index lookup with the same canonical resource identifier used in AddVectorAndGpuReq (use the vectorMap lookup for the canonical GPU resource constant) and when subtracting vec.Get(i) from r.BaseResource.scalarResources[rName] convert the float value to an integer safely (use the same numeric conversion used in AddVectorAndGpuReq, e.g., rounding or explicit integer-safe conversion) so you don’t truncate/corrupt resource counts; keep references to SubVectorAndGpuReq, AddVectorAndGpuReq, vectorMap.GetIndex, vectorMap.ResourceAt and r.BaseResource.scalarResources to locate the changes.pkg/scheduler/api/unschedule_info.go (1)
pkg/scheduler/api/unschedule_info.go (1)

20-57: Consider using a constant for the "gpu" resource name.

The function correctly migrates to vector-based resource access. However, there's an inconsistency in how resource names are accessed:
- Line 33: uses the hardcoded `"gpu"` string
- Line 41: uses the `v1.ResourceCPU` constant
- Line 49: uses the `v1.ResourceMemory` constant

For consistency and maintainability, consider using a constant for the GPU resource name (e.g., `commonconstants.GpuResource` or `constants.GpuResource` as used elsewhere in the codebase).

Suggested fix

```diff
+import commonconstants "github.com/kai-scheduler/KAI-scheduler/pkg/common/constants"
+
 func getOverCapacityMessageDetails(queueName, resourceName string, deserved, used float64,
 	requestedResources resource_info.ResourceVector, vectorMap *resource_info.ResourceVectorMap) string {
 	switch resourceName {
 	case GpuResource:
 		return fmt.Sprintf("Workload requested %v GPUs, but %s quota is %v GPUs, "+
 			"while %v GPUs are already allocated for non-preemptible pods.",
-			requestedResources.Get(vectorMap.GetIndex("gpu")),
+			requestedResources.Get(vectorMap.GetIndex(commonconstants.GpuResource)),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/scheduler/api/unschedule_info.go` around lines 20 - 57, The code mixes a hardcoded "gpu" string with constants for CPU/Memory in getOverCapacityMessageDetails/GetBuildOverCapacityMessageForQueue; replace the literal "gpu" passed to vectorMap.GetIndex with the project constant used elsewhere (e.g., commonconstants.GpuResource or the repo's GpuResource constant) so GPU lookups match v1.ResourceCPU/v1.ResourceMemory usage and are maintainable; update any imports/usages referenced by getOverCapacityMessageDetails and GetBuildOverCapacityMessageForQueue to use that constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/scheduler/api/node_info/node_info.go`:
- Around line 688-695: GetRequiredInitQuota is overwriting the total GPU slot in
result with a per-device fractional value, which reduces multi-GPU requests;
instead only replace the GPU entry when the request is a pure GPU-memory request
(i.e., no explicit GPU count in pi.ResReqVector but only memory-based
requirement). Modify GetRequiredInitQuota to: obtain gpuIdx via
ni.VectorMap.GetIndex("gpu"), detect if pi.GpuRequirement is memory-only (e.g.,
MigResources() non-empty logic inverted or check whether pi.ResReqVector already
contains a GPU count), and only call result.Set(gpuIdx,
ni.getGpuMemoryFractionalOnNode(ni.GetResourceGpuMemory(&pi.GpuRequirement)))
for pure GPU-memory cases; otherwise leave the existing value from
pi.ResReqVector untouched so multi-device totals (2, 3 x 0.5, etc.) are
preserved.
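The guard this prompt asks for can be sketched as below. `GpuRequirement`, `gpuSlotValue`, and the memory-to-fraction conversion are simplified stand-ins for the scheduler's real types, under the assumption that an explicit device count should always take precedence over a memory-derived fraction.

```go
package main

import "fmt"

// GpuRequirement is an illustrative stand-in for the scheduler's real type:
// a request carries either an explicit device count or a GPU-memory amount.
type GpuRequirement struct {
	Gpus      float64 // explicit device count, e.g. 2, or 0.5 for a shared GPU
	GpuMemory int64   // bytes requested via a GPU-memory-only request
}

// gpuSlotValue decides what to write into the vector's GPU slot: keep the
// explicit device count when one exists, and only fall back to a
// memory-derived fraction for pure GPU-memory requests.
func gpuSlotValue(req GpuRequirement, nodeGpuMemory int64) float64 {
	if req.Gpus > 0 {
		return req.Gpus // preserve multi-device totals (2, 3 x 0.5, ...)
	}
	if req.GpuMemory > 0 && nodeGpuMemory > 0 {
		return float64(req.GpuMemory) / float64(nodeGpuMemory)
	}
	return 0
}

func main() {
	// A 2-GPU request must not be shrunk to a per-device fraction.
	fmt.Println(gpuSlotValue(GpuRequirement{Gpus: 2}, 16<<30)) // prints 2
	// A memory-only request of half the node's GPU memory becomes 0.5.
	fmt.Println(gpuSlotValue(GpuRequirement{GpuMemory: 8 << 30}, 16<<30)) // prints 0.5
}
```

The key property is that the two branches are mutually exclusive, so a multi-GPU total read from `pi.ResReqVector` is never overwritten by the fractional path.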
In `@pkg/scheduler/api/pod_info/pod_info_test.go`:
- Around line 466-475: The test fixture creates a PodInfo without ResReqVector
or AcceptedResourceVector, but updatePodAdditionalFields() now rebuilds
ResReqVector at the end; to mirror production and catch regressions, initialize
those vectors in the fixture: after constructing the PodInfo (pi :=
&PodInfo{...}), either call updatePodAdditionalFields(pi) or explicitly populate
pi.ResReqVector and pi.AcceptedResourceVector (derived from pi.GpuRequirement
and vectorMap) so the test uses the same vector state as the real code.
In `@pkg/scheduler/api/pod_info/pod_info.go`:
- Around line 261-265: SetVectorMap currently only swaps pi.VectorMap leaving
pi.ResReqVector and pi.AcceptedResourceVector using the old map indices; update
SetVectorMap to rebuild both vectors to match the new map: use the new vectorMap
to re-create pi.ResReqVector from the pod's resource request fields (so GetIndex
lookups map correctly) and reinitialize or remap pi.AcceptedResourceVector to
the new length/indices (preserving any known accepted values if available,
otherwise zeroing), ensuring PodInfo.SetVectorMap and PodGroupInfo.SetVectorMap
operate safely with the new map.
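The rebinding step this prompt describes can be sketched as a projection from old indices to new ones. The vector types are simplified stand-ins for `resource_info`'s, and `remapVector` is a hypothetical helper, not the project's actual rebinding function.

```go
package main

import "fmt"

// ResourceVectorMap is a simplified stand-in: the slice order defines
// each resource's slot in a vector.
type ResourceVectorMap struct {
	names []string
}

// GetIndex returns -1 for resources the map does not track.
func (m *ResourceVectorMap) GetIndex(name string) int {
	for i, n := range m.names {
		if n == name {
			return i
		}
	}
	return -1
}

func (m *ResourceVectorMap) Len() int { return len(m.names) }

type ResourceVector []float64

// remapVector projects a vector indexed by oldMap onto newMap, so that
// swapping a VectorMap does not leave values bound to stale indices.
// Resources absent from newMap are dropped; new resources start at zero.
func remapVector(vec ResourceVector, oldMap, newMap *ResourceVectorMap) ResourceVector {
	out := make(ResourceVector, newMap.Len())
	for i, name := range oldMap.names {
		if j := newMap.GetIndex(name); j >= 0 && i < len(vec) {
			out[j] = vec[i]
		}
	}
	return out
}

func main() {
	oldMap := &ResourceVectorMap{names: []string{"cpu", "memory", "gpu"}}
	newMap := &ResourceVectorMap{names: []string{"gpu", "cpu", "memory", "pods"}}
	vec := ResourceVector{4000, 8e9, 2} // indexed by oldMap
	// gpu moves from slot 2 to slot 0; pods appears as a zeroed new slot.
	fmt.Println(remapVector(vec, oldMap, newMap))
}
```

A `SetVectorMap` that applies this projection to `ResReqVector` and `AcceptedResourceVector` before storing the new map would keep both vectors consistent with the map they are read against.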
In `@pkg/scheduler/api/podgroup_info/job_info.go`:
- Around line 455-463: SetVectorMap currently replaces pgi.VectorMap and
rebuilds pgi.AllocatedVector but still sums the tasks' existing ResReqVector
values which may be bound to the old map; iterate pgi.GetAllPodsMap() and for
each task rebind or project its ResReqVector onto the new vectorMap (e.g. call
the vector rebinding/projection helper used elsewhere in the codebase) before
calling pgi.AllocatedVector.Add(...), so AllocatedVector is computed from
vectors that are consistent with pgi.VectorMap; update SetVectorMap to perform
this rebind/projection step for each task prior to aggregation.
In `@pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy.go`:
- Around line 50-52: The slice access using node.VectorMap.GetIndex(...) on
requiredInitQuota is unsafe because GetIndex can return -1 (e.g., GPU may be
absent) and direct indexing causes panics; update the code that builds the tuple
using requiredInitQuota[node.VectorMap.GetIndex(v1.ResourceCPU)],
requiredInitQuota[node.VectorMap.GetIndex(v1.ResourceMemory)],
requiredInitQuota[node.VectorMap.GetIndex(constants.GpuResource)] to use safe
lookups instead — either call the VectorMap.Get(index) accessor that handles
bounds or guard the index (check GetIndex(...) >= 0) before indexing, mirroring
the pattern in GetRequiredInitQuota which uses .Set/checked access; ensure you
reference GetIndex, VectorMap, requiredInitQuota, v1.ResourceCPU,
v1.ResourceMemory, and constants.GpuResource when making the change.
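The bounds-checked access this prompt recommends can be sketched as a small accessor. `ResourceVector.Get` here is illustrative and assumes, as the prompt does, that `GetIndex` returns -1 for an absent resource.

```go
package main

import "fmt"

// ResourceVector is a simplified stand-in with a bounds-checked accessor,
// mirroring the checked .Get access pattern the review recommends.
type ResourceVector []float64

// Get returns 0 for a negative index (GetIndex's "not found" result) or an
// out-of-range index, instead of panicking like direct slice indexing would.
func (v ResourceVector) Get(idx int) float64 {
	if idx < 0 || idx >= len(v) {
		return 0
	}
	return v[idx]
}

func main() {
	vec := ResourceVector{4000, 8e9} // cpu, memory; no gpu tracked on this node
	gpuIdx := -1                     // what GetIndex returns for an absent resource
	fmt.Println(vec.Get(gpuIdx))     // prints 0, rather than panicking on vec[gpuIdx]
}
```

Using `requiredInitQuota.Get(node.VectorMap.GetIndex(...))` instead of raw indexing makes a missing GPU entry degrade to zero rather than crash the scheduling cycle.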
In `@pkg/scheduler/plugins/topology/job_filtering_test.go`:
- Around line 2205-2211: The test currently reports a length mismatch but still
indexes job.JobFitErrors[i], which can panic when actual fit errors are shorter
than expected; update the validation in the test (job_filtering_test.go) to
guard access to job.JobFitErrors before indexing—either iterate only up to
min(len(job.JobFitErrors), len(tt.expectedFitErrors)) or skip per-item
comparisons when the lengths differ (e.g., continue after the initial length
check) so accesses like job.JobFitErrors[i] are never out-of-bounds.
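The guard this prompt describes can be sketched as a small comparison helper; `compareFitErrors` and the string-based error type are hypothetical simplifications of the test's actual fit-error structures.

```go
package main

import "fmt"

// compareFitErrors reports mismatches between actual and expected fit
// errors. It records a length mismatch, then compares only up to the
// shorter of the two slices so indexing never goes out of bounds.
func compareFitErrors(actual, expected []string) []string {
	var mismatches []string
	if len(actual) != len(expected) {
		mismatches = append(mismatches,
			fmt.Sprintf("length mismatch: got %d, want %d", len(actual), len(expected)))
	}
	n := len(actual)
	if len(expected) < n {
		n = len(expected)
	}
	for i := 0; i < n; i++ { // safe: i is always within both slices
		if actual[i] != expected[i] {
			mismatches = append(mismatches,
				fmt.Sprintf("error %d: got %q, want %q", i, actual[i], expected[i]))
		}
	}
	return mismatches
}

func main() {
	// A shorter actual list no longer panics; it reports the length mismatch.
	fmt.Println(compareFitErrors(
		[]string{"node A: no gpu"},
		[]string{"node A: no gpu", "node B: no cpu"}))
}
```

The same min-length pattern applied to `job.JobFitErrors` keeps the test's per-item comparison from indexing past the shorter slice.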
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9b0f5e4a-f563-46b6-bded-124627a7c032
📒 Files selected for processing (47)
- pkg/scheduler/actions/common/feasible_nodes_test.go
- pkg/scheduler/actions/common/minimal_job_comparison.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/idle_gpus.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/idle_gpus_test.go
- pkg/scheduler/actions/common/solvers/accumulated_scenario_filters/idle_gpus/topology_aware_idle_gpus.go
- pkg/scheduler/actions/common/solvers/pod_scenario_builder_test.go
- pkg/scheduler/actions/utils/job_order_by_queue_test.go
- pkg/scheduler/api/common_info/pod_errors.go
- pkg/scheduler/api/common_info/pod_errors_test.go
- pkg/scheduler/api/node_info/gpu_sharing_node_info.go
- pkg/scheduler/api/node_info/node_info.go
- pkg/scheduler/api/pod_info/pod_info.go
- pkg/scheduler/api/pod_info/pod_info_benchmark_test.go
- pkg/scheduler/api/pod_info/pod_info_test.go
- pkg/scheduler/api/podgroup_info/allocation_info.go
- pkg/scheduler/api/podgroup_info/allocation_info_test.go
- pkg/scheduler/api/podgroup_info/job_info.go
- pkg/scheduler/api/podgroup_info/job_info_test.go
- pkg/scheduler/api/resource_info/gpu_resource_requirment.go
- pkg/scheduler/api/resource_info/resource_info.go
- pkg/scheduler/api/unschedule_info.go
- pkg/scheduler/api/unschedule_info_test.go
- pkg/scheduler/cache/cache.go
- pkg/scheduler/framework/session.go
- pkg/scheduler/framework/statement_checkpoint_test.go
- pkg/scheduler/framework/statement_test.go
- pkg/scheduler/framework/statement_test_utils.go
- pkg/scheduler/gpu_sharing/gpuSharing.go
- pkg/scheduler/k8s_internal/predicates/maxNodeResources.go
- pkg/scheduler/plugins/elastic/elastic_test.go
- pkg/scheduler/plugins/gpusharingorder/gpusharingorder.go
- pkg/scheduler/plugins/nodeavailability/nodeavailability_test.go
- pkg/scheduler/plugins/nodeplacement/nodespread_test.go
- pkg/scheduler/plugins/predicates/predicates.go
- pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy.go
- pkg/scheduler/plugins/proportion/capacity_policy/capacity_policy_test.go
- pkg/scheduler/plugins/proportion/capacity_policy/quota_check.go
- pkg/scheduler/plugins/proportion/proportion.go
- pkg/scheduler/plugins/proportion/proportion_test.go
- pkg/scheduler/plugins/proportion/queue_order/queue_order_test.go
- pkg/scheduler/plugins/proportion/reclaimable/reclaimable_test.go
- pkg/scheduler/plugins/resourcetype/resourcetype_test.go
- pkg/scheduler/plugins/topology/job_filtering.go
- pkg/scheduler/plugins/topology/job_filtering_test.go
- pkg/scheduler/test_utils/jobs_fake/jobs.go
- pkg/scheduler/test_utils/test_utils.go
- pkg/scheduler/test_utils/test_utils_builder.go
💤 Files with no reviewable changes (1)
- pkg/scheduler/plugins/proportion/proportion_test.go
📊 Performance Benchmark Results
Merging this branch changes the coverage (3 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
Force-pushed 5add9ad to d6a4e73
…Info, remove JobRequirement

Remove deprecated Resource-based fields and types:
- PodInfo: remove ResReq, AcceptedResource; add GpuRequirement, AcceptedGpuRequirement
- PodGroupInfo: remove Allocated, tasksToAllocateInitResource
- Remove JobRequirement struct entirely, replace with ResourceVector/ResourceQuantities

All resource operations now use vectors exclusively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Erez Freiberger <enoodle@gmail.com>
…stant

Signed-off-by: Erez Freiberger <enoodle@gmail.com>
Force-pushed d6a4e73 to fe7f3f6
Merging this branch changes the coverage (3 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
Signed-off-by: Erez Freiberger <enoodle@gmail.com>
Merging this branch changes the coverage (4 decrease, 6 increase)
Coverage by file: Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
…Info, remove JobRequirement
Description
Remove deprecated Resource-based fields and types:
All resource operations now use vectors exclusively.
Fixes #1354