PCP-6077: resource- and distribution-aware VM placement selection (#317) #321
Conversation
Pull request overview
Adds resource-, tag-, and control-plane distribution–aware LXD VM host selection in the MAAS provider, and introduces control-plane VM tagging to support anti-affinity and maintenance workflows.
Changes:
- Replace LXD host selection with a filter-then-rank selector (zone/pool/tags/min resources + CP anti-affinity + managed-host tie-break).
- Tag control-plane VMs with cluster identity tags to enable distribution-aware placement and maintenance operations.
- Update MAAS client dependency version and expand unit tests for host selection logic.
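The filter-then-rank selection described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names and fields (`host`, `selectHost`, `cpCount`, etc.), not the PR's actual code; the real selector filters on zone/pool/tags/min resources and ranks by CP anti-affinity, resource score, and managed-host preference.

```go
package main

import (
	"fmt"
	"sort"
)

// host is a stand-in for a MAAS VM host (hypothetical, for illustration only).
type host struct {
	name      string
	zone      string
	freeCores int
	cpCount   int // control-plane VMs already placed here
	managed   bool
}

// selectHost filters out hosts that fail hard constraints, then ranks the
// survivors: fewest CP VMs first (anti-affinity), then most free cores,
// then managed hosts, with name as a deterministic final tie-breaker.
func selectHost(hosts []host, zone string, minCores int) (host, error) {
	var candidates []host
	for _, h := range hosts {
		if h.zone == zone && h.freeCores >= minCores {
			candidates = append(candidates, h)
		}
	}
	if len(candidates) == 0 {
		return host{}, fmt.Errorf("no host satisfies the constraints")
	}
	sort.Slice(candidates, func(i, j int) bool {
		a, b := candidates[i], candidates[j]
		if a.cpCount != b.cpCount {
			return a.cpCount < b.cpCount // spread control-plane VMs
		}
		if a.freeCores != b.freeCores {
			return a.freeCores > b.freeCores // prefer more headroom
		}
		if a.managed != b.managed {
			return a.managed // prefer managed over OOB
		}
		return a.name < b.name // deterministic tie-break
	})
	return candidates[0], nil
}

func main() {
	hosts := []host{
		{name: "h1", zone: "az1", freeCores: 8, cpCount: 2, managed: true},
		{name: "h2", zone: "az1", freeCores: 4, cpCount: 0, managed: true},
		{name: "h3", zone: "az2", freeCores: 16, cpCount: 0, managed: true},
	}
	best, _ := selectHost(hosts, "az1", 4)
	fmt.Println(best.name) // h2: in-zone, fewest CP VMs
}
```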
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pkg/maas/machine/machine.go | Builds new host-selection options (zone/pool/tags/resources + cluster ID) and adds CP VM tagging + safer zone extraction. |
| pkg/maas/lxd/host_maas_client.go | Implements new LXD host selection algorithm with filtering and ranking (including CP distribution + resource scoring). |
| pkg/maas/lxd/host_maas_client_test.go | Adds/updates extensive unit tests and fakes for new selector behaviors (tags/resources/maintenance/anti-affinity). |
| go.mod | Changes MAAS client module version. |
| go.sum | Updates checksums to match dependency changes. |
```go
func SelectLXDHostWithMaasClient(client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
	log := textlogger.NewLogger(textlogger.NewConfig())
	ctx := context.Background()
```
SelectLXDHostWithMaasClient creates its own context.Background() and uses it for MAAS API calls, so caller cancellation/timeouts (e.g., controller reconcile ctx) are ignored. Consider accepting a ctx parameter (or threading ctx through SelectOptions) and using that instead of context.Background().
Suggested change:

```diff
-func SelectLXDHostWithMaasClient(client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
-	log := textlogger.NewLogger(textlogger.NewConfig())
-	ctx := context.Background()
+func SelectLXDHostWithMaasClient(ctx context.Context, client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
+	log := textlogger.NewLogger(textlogger.NewConfig())
```
```go
// lxdHostSelectorClient extends machineGetter with VMHosts for CP distribution counting
type lxdHostSelectorClient interface {
	machineGetter
	VMHosts() maasclient.VMHosts
```
The lxdHostSelectorClient interface requires VMHosts(), but SelectLXDHostWithMaasClient doesn’t use VMHosts() anywhere in this file. Tightening the interface to only the methods actually used (Machines()) would reduce coupling and simplify fakes/mocks.
Suggested change:

```diff
-// lxdHostSelectorClient extends machineGetter with VMHosts for CP distribution counting
+// lxdHostSelectorClient defines the minimal client methods needed for LXD host selection.
 type lxdHostSelectorClient interface {
 	machineGetter
-	VMHosts() maasclient.VMHosts
```
```go
	hostPool = host.ResourcePool().Name()
}
// 3. Prefer managed host over OOB
return isManagedHost(a.host) && !isManagedHost(b.host)
```
Host ranking uses sort.Slice (unstable) and the comparator has no deterministic final tie-breaker when cpCount, resourceScore, and managed status are equal. This can lead to non-deterministic host selection across runs/API order changes. Consider adding a final tie-breaker (e.g., host.SystemID()/Name) or using sort.SliceStable to preserve input order for exact ties.
Suggested change:

```diff
-	return isManagedHost(a.host) && !isManagedHost(b.host)
+	ma, mb := isManagedHost(a.host), isManagedHost(b.host)
+	if ma != mb {
+		return ma && !mb
+	}
+	// 4. Final deterministic tie-breaker: use SystemID to ensure stable ordering
+	return a.host.SystemID() < b.host.SystemID()
```
```go
func (h *fakeVMHost) AvailableCores() int {
	if h.availableCores == 0 {
		return 16 // default
	}
	return h.availableCores
}

func (h *fakeVMHost) AvailableMemory() int {
	if h.availableMemory == 0 {
		return 32768 // default 32GB
	}
	return h.availableMemory
}
```
fakeVMHost.AvailableCores/AvailableMemory treat a configured value of 0 as “unset” and replace it with a default. This prevents tests from representing a real host with 0 available resources and can mask resource-filtering bugs. Consider tracking “set” explicitly (e.g., pointers or separate booleans) so 0 can be a valid test value.
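One way to track "set" explicitly is to use pointer fields in the fake, so a configured value of 0 is distinguishable from an unset field. A minimal sketch, assuming hypothetical field and helper names (`availableCores`, `intPtr`) rather than the PR's actual test code:

```go
package main

import "fmt"

// fakeVMHost is a hypothetical test double: nil means "unset, use the
// default", while a non-nil pointer (including a pointer to 0) is honored.
type fakeVMHost struct {
	availableCores  *int
	availableMemory *int
}

// intPtr is a small helper for building pointer literals in tests.
func intPtr(v int) *int { return &v }

func (h *fakeVMHost) AvailableCores() int {
	if h.availableCores == nil {
		return 16 // default when unset
	}
	return *h.availableCores
}

func (h *fakeVMHost) AvailableMemory() int {
	if h.availableMemory == nil {
		return 32768 // default 32GB when unset
	}
	return *h.availableMemory
}

func main() {
	exhausted := &fakeVMHost{availableCores: intPtr(0), availableMemory: intPtr(0)}
	unset := &fakeVMHost{}
	fmt.Println(exhausted.AvailableCores(), unset.AvailableCores()) // 0 16
}
```

With this shape, a test can model a fully exhausted host (`intPtr(0)`) and exercise the resource filter's zero-capacity path, which the current zero-means-default fake cannot express.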
```go
clusterId := s.deriveClusterID()
if clusterId == "" {
	s.scope.V(1).Info("Could not derive cluster ID for CP tagging", "systemID", systemID)
	return
}

clusterTag := maintenance.TagVMClusterPrefix + maintenance.SanitizeID(clusterId)
```
Local variable is named clusterId, but the codebase generally uses ID initialisms in all-caps (e.g., deriveClusterID, systemID). Renaming to clusterID would align with Go initialism conventions and improve consistency.
Backport PR: #317