PCP-6077: vm placement selection with resource and distribution aware (#317) #321

Merged
AmitSahastra merged 1 commit into spectro-release-4.8 from cp/lxd-placement on Feb 28, 2026

Conversation

kpiyush17 commented Feb 28, 2026

backport pr: #317

AmitSahastra merged commit 756904b into spectro-release-4.8 on Feb 28, 2026
6 checks passed
AmitSahastra deleted the cp/lxd-placement branch on February 28, 2026 at 08:59

Copilot AI left a comment

Pull request overview

Adds resource-, tag-, and control-plane distribution–aware LXD VM host selection in the MAAS provider, and introduces control-plane VM tagging to support anti-affinity and maintenance workflows.

Changes:

  • Replace LXD host selection with a filter-then-rank selector (zone/pool/tags/min resources + CP anti-affinity + managed-host tie-break).
  • Tag control-plane VMs with cluster identity tags to enable distribution-aware placement and maintenance operations.
  • Update MAAS client dependency version and expand unit tests for host selection logic.
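The filter-then-rank shape described in the first bullet can be sketched as follows. This is only an illustration under stated assumptions, not the PR's code: `vmHost`, `selectOpts`, and `selectHost` are hypothetical stand-ins for the provider's real types, pool and tag filters are left out for brevity, and "more free memory wins" is an assumed placeholder for the actual resource score.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// vmHost is a hypothetical stand-in for maasclient.VMHost.
type vmHost struct {
	ID      string
	Zone    string
	FreeCPU int  // available cores
	FreeMem int  // available memory in MiB
	CPCount int  // control-plane VMs of the same cluster already placed here
	Managed bool // true for MAAS-managed hosts, false for out-of-band ones
}

// selectOpts holds the hard constraints; an empty Zone means "any zone".
type selectOpts struct {
	Zone   string
	MinCPU int
	MinMem int
}

// selectHost drops ineligible hosts first, then ranks the survivors.
func selectHost(hosts []vmHost, o selectOpts) (vmHost, error) {
	var eligible []vmHost
	for _, h := range hosts {
		if o.Zone != "" && h.Zone != o.Zone {
			continue
		}
		if h.FreeCPU < o.MinCPU || h.FreeMem < o.MinMem {
			continue
		}
		eligible = append(eligible, h)
	}
	if len(eligible) == 0 {
		return vmHost{}, errors.New("no eligible LXD host")
	}
	sort.Slice(eligible, func(i, j int) bool {
		a, b := eligible[i], eligible[j]
		if a.CPCount != b.CPCount {
			return a.CPCount < b.CPCount // spread control-plane VMs across hosts
		}
		if a.FreeMem != b.FreeMem {
			return a.FreeMem > b.FreeMem // assumed score: more free memory wins
		}
		if a.Managed != b.Managed {
			return a.Managed // managed hosts before out-of-band ones
		}
		return a.ID < b.ID // deterministic last resort
	})
	return eligible[0], nil
}

func main() {
	hosts := []vmHost{
		{ID: "h1", Zone: "az1", FreeCPU: 8, FreeMem: 16384, CPCount: 1, Managed: true},
		{ID: "h2", Zone: "az1", FreeCPU: 8, FreeMem: 8192, CPCount: 0, Managed: false},
		{ID: "h3", Zone: "az2", FreeCPU: 32, FreeMem: 65536, CPCount: 0, Managed: true},
		{ID: "h4", Zone: "az1", FreeCPU: 2, FreeMem: 32768, CPCount: 0, Managed: true},
	}
	best, err := selectHost(hosts, selectOpts{Zone: "az1", MinCPU: 4, MinMem: 4096})
	fmt.Println(best.ID, err) // h2 <nil>
}
```

Here h3 fails the zone filter and h4 fails the minimum-core filter, so only h1 and h2 are ranked; h2 wins because it hosts no control-plane VM for the cluster, even though it has less free memory and is not managed.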

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| pkg/maas/machine/machine.go | Builds new host-selection options (zone/pool/tags/resources + cluster ID) and adds CP VM tagging + safer zone extraction. |
| pkg/maas/lxd/host_maas_client.go | Implements new LXD host selection algorithm with filtering and ranking (including CP distribution + resource scoring). |
| pkg/maas/lxd/host_maas_client_test.go | Adds/updates extensive unit tests and fakes for new selector behaviors (tags/resources/maintenance/anti-affinity). |
| go.mod | Changes MAAS client module version. |
| go.sum | Updates checksums to match dependency changes. |

Comment on lines +300 to +302
func SelectLXDHostWithMaasClient(client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
	log := textlogger.NewLogger(textlogger.NewConfig())
	ctx := context.Background()

Copilot AI Feb 28, 2026

SelectLXDHostWithMaasClient creates its own context.Background() and uses it for MAAS API calls, so caller cancellation/timeouts (e.g., controller reconcile ctx) are ignored. Consider accepting a ctx parameter (or threading ctx through SelectOptions) and using that instead of context.Background().

Suggested change
- func SelectLXDHostWithMaasClient(client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
- 	log := textlogger.NewLogger(textlogger.NewConfig())
- 	ctx := context.Background()
+ func SelectLXDHostWithMaasClient(ctx context.Context, client lxdHostSelectorClient, hosts []maasclient.VMHost, opts SelectOptions) (maasclient.VMHost, error) {
+ 	log := textlogger.NewLogger(textlogger.NewConfig())

Copilot uses AI. Check for mistakes.
Comment on lines +262 to +265
// lxdHostSelectorClient extends machineGetter with VMHosts for CP distribution counting
type lxdHostSelectorClient interface {
	machineGetter
	VMHosts() maasclient.VMHosts

Copilot AI Feb 28, 2026

The lxdHostSelectorClient interface requires VMHosts(), but SelectLXDHostWithMaasClient doesn’t use VMHosts() anywhere in this file. Tightening the interface to only the methods actually used (Machines()) would reduce coupling and simplify fakes/mocks.

Suggested change
- // lxdHostSelectorClient extends machineGetter with VMHosts for CP distribution counting
- type lxdHostSelectorClient interface {
- 	machineGetter
- 	VMHosts() maasclient.VMHosts
+ // lxdHostSelectorClient defines the minimal client methods needed for LXD host selection.
+ type lxdHostSelectorClient interface {
+ 	machineGetter
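The benefit for fakes can be shown with a small sketch. The names here (`tagLister`, `countTagged`, `fakeTags`) are hypothetical and unrelated to the real `machineGetter`, whose methods are not visible in this review; the point is only that a consumer-side interface with one method needs a one-method fake.

```go
package main

import "fmt"

// tagLister is a hypothetical narrow interface: only what the caller uses.
type tagLister interface {
	TagsFor(systemID string) []string
}

// countTagged counts the machines that carry the given tag.
func countTagged(c tagLister, systemIDs []string, tag string) int {
	n := 0
	for _, id := range systemIDs {
		for _, t := range c.TagsFor(id) {
			if t == tag {
				n++
				break
			}
		}
	}
	return n
}

// fakeTags satisfies tagLister with a plain map; there are no VM-host
// methods to stub because the interface does not ask for them.
type fakeTags map[string][]string

func (f fakeTags) TagsFor(systemID string) []string { return f[systemID] }

func main() {
	fake := fakeTags{
		"vm-a": {"cluster-x", "control-plane"},
		"vm-b": {"cluster-y"},
	}
	fmt.Println(countTagged(fake, []string{"vm-a", "vm-b", "vm-c"}, "cluster-x")) // 1
}
```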

hostPool = host.ResourcePool().Name()
}
// 3. Prefer managed host over OOB
return isManagedHost(a.host) && !isManagedHost(b.host)

Copilot AI Feb 28, 2026

Host ranking uses sort.Slice (unstable) and the comparator has no deterministic final tie-breaker when cpCount, resourceScore, and managed status are equal. This can lead to non-deterministic host selection across runs/API order changes. Consider adding a final tie-breaker (e.g., host.SystemID()/Name) or using sort.SliceStable to preserve input order for exact ties.

Suggested change
- 	return isManagedHost(a.host) && !isManagedHost(b.host)
+ 	ma, mb := isManagedHost(a.host), isManagedHost(b.host)
+ 	if ma != mb {
+ 		return ma && !mb
+ 	}
+ 	// 4. Final deterministic tie-breaker: use SystemID to ensure stable ordering
+ 	return a.host.SystemID() < b.host.SystemID()
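The effect of the final tie-breaker can be checked with a small self-contained sketch. The `candidate` type and `rank` helper are hypothetical, and treating a higher `resourceScore` as better is an assumption; the point is that two hosts tied on every other key come out in the same order regardless of the order the API returned them in.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical ranking record for one host.
type candidate struct {
	systemID      string
	cpCount       int
	resourceScore float64
	managed       bool
}

// less orders candidates by the review's keys, ending with systemID so
// that no two distinct hosts ever compare as equal.
func less(a, b candidate) bool {
	if a.cpCount != b.cpCount {
		return a.cpCount < b.cpCount
	}
	if a.resourceScore != b.resourceScore {
		return a.resourceScore > b.resourceScore // assumed: higher is better
	}
	if a.managed != b.managed {
		return a.managed
	}
	return a.systemID < b.systemID
}

// rank sorts a copy of the input and returns the system IDs in order.
func rank(in []candidate) []string {
	c := append([]candidate(nil), in...)
	sort.Slice(c, func(i, j int) bool { return less(c[i], c[j]) })
	ids := make([]string, len(c))
	for i, x := range c {
		ids[i] = x.systemID
	}
	return ids
}

func main() {
	a := candidate{systemID: "abc", cpCount: 0, resourceScore: 0.5, managed: true}
	b := candidate{systemID: "xyz", cpCount: 0, resourceScore: 0.5, managed: true}
	fmt.Println(rank([]candidate{a, b}), rank([]candidate{b, a})) // [abc xyz] [abc xyz]
}
```

`sort.SliceStable` alone would keep the input order for exact ties, which is deterministic only as long as the input order is; comparing on a unique key removes that dependency.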

Comment on lines +79 to +90
func (h *fakeVMHost) AvailableCores() int {
	if h.availableCores == 0 {
		return 16 // default
	}
	return h.availableCores
}
func (h *fakeVMHost) AvailableMemory() int {
	if h.availableMemory == 0 {
		return 32768 // default 32GB
	}
	return h.availableMemory
}

Copilot AI Feb 28, 2026

fakeVMHost.AvailableCores/AvailableMemory treat a configured value of 0 as “unset” and replace it with a default. This prevents tests from representing a real host with 0 available resources and can mask resource-filtering bugs. Consider tracking “set” explicitly (e.g., pointers or separate booleans) so 0 can be a valid test value.
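One way to do this, sketched with pointer fields: `nil` means "not configured, use the default", while a pointer to 0 is a genuine zero. The struct below is a trimmed, hypothetical version of the test fake, and `intPtr` is a helper introduced here for illustration.

```go
package main

import "fmt"

// fakeVMHost is a trimmed, hypothetical version of the test fake.
// A nil pointer means "unset"; a pointer to 0 means "really zero".
type fakeVMHost struct {
	availableCores  *int
	availableMemory *int
}

// intPtr is a small helper for building test fixtures.
func intPtr(v int) *int { return &v }

func (h *fakeVMHost) AvailableCores() int {
	if h.availableCores == nil {
		return 16 // default
	}
	return *h.availableCores
}

func (h *fakeVMHost) AvailableMemory() int {
	if h.availableMemory == nil {
		return 32768 // default 32GB
	}
	return *h.availableMemory
}

func main() {
	unset := &fakeVMHost{}
	exhausted := &fakeVMHost{availableCores: intPtr(0), availableMemory: intPtr(0)}

	fmt.Println(unset.AvailableCores(), unset.AvailableMemory())         // 16 32768
	fmt.Println(exhausted.AvailableCores(), exhausted.AvailableMemory()) // 0 0
}
```

With this shape a test can place a fully exhausted host in the candidate list and assert that the minimum-resource filter rejects it.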

Comment on lines +1117 to +1124
	clusterId := s.deriveClusterID()
	if clusterId == "" {
		s.scope.V(1).Info("Could not derive cluster ID for CP tagging", "systemID", systemID)
		return
	}

	clusterTag := maintenance.TagVMClusterPrefix + maintenance.SanitizeID(clusterId)

Copilot AI Feb 28, 2026

Local variable is named clusterId, but the codebase generally uses ID initialisms in all-caps (e.g., deriveClusterID, systemID). Renaming to clusterID would align with Go initialism conventions and improve consistency.
