KubeNodeSmith is a Kubernetes autoscaler built with Kubebuilder. It watches for unschedulable pods, provisions new worker nodes from infrastructure providers (Proxmox, Redfish, etc.), and deprovisions them when idle. Built on controller-runtime, it provides declarative node lifecycle management through custom resources.
🛠️ Active development. The architecture follows industry patterns similar to Karpenter's NodeClaim model.
KubeNodeSmith uses four custom resources to manage autoscaling:
- NodeSmithControlPlane – Top-level controller config that references which pools to manage
- NodeSmithPool – Defines node pools with min/max limits, machine templates, and scale policies
- NodeSmithProvider – Infrastructure provider configuration (Proxmox credentials, VM settings, etc.)
- NodeSmithClaim – Individual node requests with lifecycle tracking (Launched → Registered → Initialized → Ready); a rough sketch of its Go types follows this list
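The Go definitions themselves are not reproduced in this README; the following is a minimal sketch of what the `NodeSmithClaim` types in `api/v1alpha1/` might look like, inferred from the sample manifests further below. Field and condition names here are assumptions, not the authoritative definitions.

```go
// Illustrative sketch only -- the real definitions live in api/v1alpha1/.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// NodeSmithClaimSpec describes a single node request made against a pool.
type NodeSmithClaimSpec struct {
	// PoolRef names the NodeSmithPool this claim belongs to.
	PoolRef string `json:"poolRef"`
	// Requirements captures the minimum resources the new node must provide.
	Requirements ClaimRequirements `json:"requirements"`
}

// ClaimRequirements mirrors the cpuCores/memoryMiB fields shown in the samples.
type ClaimRequirements struct {
	CPUCores  int32 `json:"cpuCores"`
	MemoryMiB int64 `json:"memoryMiB"`
}

// NodeSmithClaimStatus tracks lifecycle progress via standard conditions.
type NodeSmithClaimStatus struct {
	Conditions []metav1.Condition `json:"conditions,omitempty"` // Launched, Registered, Initialized, Ready
	ProviderID string             `json:"providerID,omitempty"` // e.g. proxmox://cluster/vms/1251
	NodeName   string             `json:"nodeName,omitempty"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// NodeSmithClaim is the Schema for individual node requests.
type NodeSmithClaim struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NodeSmithClaimSpec   `json:"spec,omitempty"`
	Status NodeSmithClaimStatus `json:"status,omitempty"`
}
```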
Three controllers reconcile these resources:
- `ControlPlaneReconciler` – Manages provider initialization and pool coordination
- `NodePoolReconciler` – Handles scale-up/down decisions based on unschedulable pods and pool limits
- `NodeClaimReconciler` – Provisions machines, waits for node registration, and handles cleanup (a skeleton is sketched after the diagram below)
```mermaid
flowchart LR
    Pod[Unschedulable Pod] --> NodePool[NodePoolReconciler]
    NodePool --> Claim[NodeSmithClaim]
    Claim --> NodeClaim[NodeClaimReconciler]
    NodeClaim --> Provider[Provider API]
    Provider --> VM[VM Created]
    VM --> Node[Node Registers]
    Node --> Ready[Claim Ready]
```
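For orientation, here is a skeleton of what the `NodeClaimReconciler` in `internal/controller/` plausibly looks like, following the standard controller-runtime shape that Kubebuilder scaffolds. The struct fields and the numbered steps in the comments are assumptions drawn from the lifecycle described in this README, not the actual implementation.

```go
// Hypothetical skeleton only -- the real reconcilers live in internal/controller/.
package controller

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"
)

// NodeClaimReconciler provisions a machine for each NodeSmithClaim and tears it
// down again when the claim is deleted.
type NodeClaimReconciler struct {
	client.Client
	// A provider handle (wrapping internal/provider) is assumed to live here.
}

// Reconcile is invoked by controller-runtime whenever a watched NodeSmithClaim
// (or a related object such as a Node) changes.
func (r *NodeClaimReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// 1. Fetch the claim named by req.NamespacedName.
	// 2. If no machine exists yet, ask the provider to create one and record Launched.
	// 3. If a machine exists, check whether its node has registered and update
	//    the Registered/Ready conditions.
	// 4. On deletion, cordon and drain the node, then deprovision the VM.
	logger.V(1).Info("reconciling claim", "claim", req.NamespacedName)
	return ctrl.Result{}, nil
}
```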
Scale-up lifecycle:
- Unschedulable pod detected → NodePoolReconciler creates NodeSmithClaim
- NodeClaimReconciler provisions VM via provider
- VM boots via DHCP netboot, cloud-init injects cluster credentials
- Node registers with cluster → Claim status: Launched → Registered → Ready
Scale-down lifecycle:
- Node becomes idle → NodePoolReconciler deletes Claim
- NodeClaimReconciler cordons, drains, and deprovisions the VM (see the cordon sketch below)
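Cordoning itself is a standard Kubernetes operation (marking the node unschedulable so no new pods land on it). A minimal sketch of how the reconciler could do this with the controller-runtime client is below; the `cordonNode` helper name is illustrative, and the real drain/deprovision flow lives in the controller and provider code.

```go
// Illustrative cordon helper; the actual drain/deprovision logic lives in the
// NodeClaim controller and the provider implementation.
package controller

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// cordonNode marks a node unschedulable so no new pods are placed on it before
// the VM is drained and deleted.
func cordonNode(ctx context.Context, c client.Client, nodeName string) error {
	var node corev1.Node
	if err := c.Get(ctx, client.ObjectKey{Name: nodeName}, &node); err != nil {
		return err
	}
	if node.Spec.Unschedulable {
		return nil // already cordoned
	}
	patch := client.MergeFrom(node.DeepCopy())
	node.Spec.Unschedulable = true
	return c.Patch(ctx, &node, patch)
}
```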
When a NodeSmithClaim is created, the following happens:
```mermaid
sequenceDiagram
    participant NC as NodeClaimReconciler
    participant P as Provider (Proxmox)
    participant VM as Virtual Machine
    participant DHCP as DHCP/PXE Server
    participant K8s as Kubernetes API
    NC->>P: ProvisionMachine(spec)
    P->>VM: Create VM (net0 boot)
    VM->>DHCP: PXE boot request
    DHCP->>VM: Network boot image
    VM->>VM: Cloud-init runs
    Note over VM: Injects cluster join token<br/>from cloud-init metadata
    VM->>K8s: kubelet registers node
    K8s->>NC: Node appears in API
    NC->>NC: Update Claim: Ready
```
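The last two steps of that sequence — noticing that the node has appeared and flipping the claim toward Ready — could be implemented roughly as below. Matching on the expected node name is an assumption based on the node-name-prefix requirement described next; the real controller may key on labels or a provider ID instead.

```go
// Illustrative registration check -- the real version lives in the NodeClaim
// controller and will differ in detail.
package controller

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// nodeRegistered reports whether a node with the expected name has joined the
// cluster and is reporting Ready.
func nodeRegistered(ctx context.Context, c client.Client, nodeName string) (bool, error) {
	var node corev1.Node
	if err := c.Get(ctx, client.ObjectKey{Name: nodeName}, &node); err != nil {
		if apierrors.IsNotFound(err) {
			return false, nil // not registered yet; requeue and check again
		}
		return false, err
	}
	for _, cond := range node.Status.Conditions {
		if cond.Type == corev1.NodeReady && cond.Status == corev1.ConditionTrue {
			return true, nil
		}
	}
	return false, nil // registered but not Ready yet
}
```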
Key requirements for nodes:
- DHCP/PXE server provides network boot capability
- Cloud-init metadata contains cluster bootstrap credentials (join token, CA cert)
- Base image must auto-register with the cluster using the configured node name prefix
- Provider configures VMs to boot from the network interface (e.g., `boot=order=net0`)
- `cmd/main.go` – Kubebuilder-generated entrypoint that starts the controller manager
- `api/v1alpha1/` – CRD type definitions for the four resource kinds
- `internal/controller/` – Controller reconciliation logic for ControlPlane, NodePool, and NodeClaim
- `internal/kube/` – Kubernetes API helpers (listing pods, nodes, resource calculations)
- `internal/provider/` – Provider interface + Proxmox implementation
- `config/` – Kubebuilder-managed kustomize configs, RBAC, CRD manifests
- `manifests/` – Deployment YAML, sample resources, and test workloads
- Kubernetes cluster (1.25+) with cluster-admin access for CRD installation
- Infrastructure provider:
  - Proxmox: API token with VM creation/deletion rights and cluster resource inspection
    - Kubernetes secret with `PROXMOX_TOKEN_ID` and `PROXMOX_SECRET` keys
  - Other providers: see `internal/provider/` for the interface to implement
- Base image: netboot-capable image that automatically joins the cluster on boot
  - Must register with kubelet using the expected node name format
  - Example: NixOS with embedded cluster bootstrap configuration
Using Nix? Drop into the dev shell and run the controller manager locally:
```bash
nix develop
make install   # Install CRDs to your cluster
make run       # Run the controller manager locally
```

The controller manager will:
- Read kubeconfig from `~/.kube/config` (or use in-cluster config when deployed)
- Start three reconciliation controllers (ControlPlane, NodePool, NodeClaim)
- Watch for changes to NodeSmith resources and unschedulable pods
- Use leader election when running multiple replicas (disabled by default for local dev); a sketch of this wiring follows below
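The Kubebuilder-generated `cmd/main.go` typically wires all of this up along the lines of the trimmed sketch below. This is the standard scaffold shape, not the project's actual file; the leader-election ID is illustrative.

```go
// Trimmed sketch of a standard Kubebuilder cmd/main.go -- not the actual file.
package main

import (
	"flag"
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
)

func main() {
	var probeAddr string
	var enableLeaderElection bool
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "Health/readiness probe endpoint.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false, "Enable leader election for HA deployments.")
	flag.Parse()

	// GetConfigOrDie reads ~/.kube/config locally, or the in-cluster config when deployed.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "kubenodesmith.parawell.cloud", // illustrative ID
	})
	if err != nil {
		os.Exit(1)
	}

	// The three reconcilers (ControlPlane, NodePool, NodeClaim) would be
	// registered here via their SetupWithManager methods.

	_ = mgr.AddHealthzCheck("healthz", healthz.Ping)
	_ = mgr.AddReadyzCheck("readyz", healthz.Ping)

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```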
Configure your resources:
Edit config/samples/ to match your infrastructure, then apply:
```bash
kubectl create namespace kubenodesmith

kubectl create secret generic proxmox-api-secret \
  --namespace kubenodesmith \
  --from-literal=PROXMOX_TOKEN_ID='user@pam!tokenid' \
  --from-literal=PROXMOX_SECRET='your-secret-here'

kubectl apply -k config/samples/
```

Optional flags for `make run` or `go run ./cmd/main.go`:
- `--health-probe-bind-address=:8081` – Health/readiness probe endpoint (default)
- `--leader-elect` – Enable leader election for HA deployments
1. Create namespace and load credentials:

   ```bash
   kubectl create namespace kubenodesmith

   kubectl create secret generic proxmox-api-secret \
     --namespace kubenodesmith \
     --from-literal=PROXMOX_TOKEN_ID='user@pam!tokenid' \
     --from-literal=PROXMOX_SECRET='your-secret-here'
   ```

2. Install CRDs:

   ```bash
   make install
   ```

3. Configure resources for your environment: edit the sample files in `config/samples/` to match your infrastructure, then apply:

   ```bash
   kubectl apply -k config/samples/
   ```

4. Deploy the controller manager:

   ```bash
   make docker-build docker-push IMG=<your-registry>/kubenodesmith:tag
   make deploy IMG=<your-registry>/kubenodesmith:tag
   ```

5. Verify deployment:

   ```bash
   kubectl get pods -n kubenodesmith-system
   kubectl logs -n kubenodesmith-system -l control-plane=controller-manager -f
   ```
Health checks: `http://<pod>:8081/healthz` and `http://<pod>:8081/readyz`
KubeNodeSmith uses CRDs under kubenodesmith.parawell.cloud/v1alpha1. See config/samples/ for complete examples:
Defines infrastructure provider settings (Proxmox, Redfish, etc.):
```yaml
apiVersion: kubenodesmith.parawell.cloud/v1alpha1
kind: NodeSmithProvider
metadata:
  name: proxmox-production
  namespace: kubenodesmith
spec:
  type: proxmox
  credentialsSecretRef:
    name: proxmox-api-secret
    namespace: kubenodesmith
  proxmox:
    endpoint: https://10.0.4.30:8006/api2/json
    nodeWhitelist: [alfaromeo, porsche]
    vmIDRange: { lower: 1250, upper: 1300 }
    managedNodeTag: kubenodesmith-managed
    vmMemOverheadMiB: 2048
    networkInterfaces:
      - { name: net0, model: virtio, bridge: vmbr0, vlanTag: 20, macPrefix: "02:00:00" }
    vmOptions:
      - { name: cpu, value: host }
      - { name: boot, value: order=net0 }
    # ... see config/samples/kubenodesmith_v1alpha1_nodesmithprovider.yaml for the full list
```

Defines a node pool with capacity limits and scaling policies:
```yaml
apiVersion: kubenodesmith.parawell.cloud/v1alpha1
kind: NodeSmithPool
metadata:
  name: proxmox-small
  namespace: kubenodesmith
spec:
  providerRef: proxmox-production
  limits:
    minNodes: 0
    maxNodes: 5
    cpuCores: 0         # 0 = unlimited
    memoryMiB: 30720    # aggregate memory ceiling
  machineTemplate:
    labels:
      node-role.kubernetes.io/worker: ""
  scaleUp:
    stabilizationWindow: 2m
  scaleDown:
    stabilizationWindow: 5m
```

Top-level controller that manages pools:
```yaml
apiVersion: kubenodesmith.parawell.cloud/v1alpha1
kind: NodeSmithControlPlane
metadata:
  name: kubenodesmith
  namespace: kubenodesmith
spec:
  pools: [proxmox-small]   # List of NodeSmithPool names to manage
```

Represents an individual node request with lifecycle status:
```yaml
apiVersion: kubenodesmith.parawell.cloud/v1alpha1
kind: NodeSmithClaim
metadata:
  name: proxmox-small-abc123
  namespace: kubenodesmith
spec:
  poolRef: proxmox-small
  requirements:
    cpuCores: 4
    memoryMiB: 8192
status:
  conditions:
    - type: Launched
      status: "True"
    - type: Registered
      status: "True"
    - type: Ready
      status: "True"
  providerID: proxmox://cluster/vms/1251
  nodeName: zagato-worker-auto-1
```

Key notes:
- All resources typically live in the same namespace
- The ControlPlane controller watches pods cluster-wide for unschedulable conditions
- NodeClaims are automatically created by the NodePool controller and cleaned up when nodes are removed
- Pool labels are added automatically: `topology.kubenodesmith.io/pool: <pool-name>`
Each reconciliation pass works roughly as follows:

- Poll: list pending pods with the “Unschedulable” condition.
- Scale up: if there’s at least one, figure out the pod’s resource requests, check the pool limits, and ask the provider for a VM big enough to host it.
- Wait: watch Kubernetes until the node registers (timeout 5 min), then add any configured labels.
- Scale down: when there are no unschedulable pods, look for idle nodes in the pool (no evictable pods), cordon them, call the provider to delete the VM, and repeat.
All Kubernetes interactions live in internal/kube.go; provider calls go through the internal/provider interface so additional backends can plug in later.
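The exact shape of that interface is not shown in this README; based on the lifecycle above it plausibly resembles the sketch below. `ProvisionMachine` appears in the sequence diagram earlier, but the remaining method and type names are assumptions — check `internal/provider/` before implementing a new backend.

```go
// Hypothetical provider contract -- the authoritative interface is defined in
// internal/provider/ and may differ in names and signatures.
package provider

import "context"

// MachineRequest captures what a NodeSmithClaim needs from the infrastructure.
type MachineRequest struct {
	Name      string // desired node/VM name
	CPUCores  int32
	MemoryMiB int64
}

// Machine describes a provisioned VM as reported back by the provider.
type Machine struct {
	ProviderID string // e.g. proxmox://cluster/vms/1251
	NodeName   string // name the node is expected to register with
}

// Provider is implemented once per backend (Proxmox today, Redfish etc. later).
type Provider interface {
	// ProvisionMachine creates a VM large enough for the request and returns
	// its identifiers; the VM netboots and joins the cluster on its own.
	ProvisionMachine(ctx context.Context, req MachineRequest) (*Machine, error)

	// DeprovisionMachine deletes the VM identified by providerID after the
	// node has been cordoned and drained.
	DeprovisionMachine(ctx context.Context, providerID string) error
}
```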
- Deploy one of the sample workloads:

  ```bash
  kubectl apply -f config/samples/test-workloads/echoserver.yaml
  ```
- Watch the scaler logs for a new node request and make sure the pod lands on it.
- Delete the workload and confirm the node is eventually cordoned and removed.
This project follows standard Kubebuilder patterns. When opening a PR:
- Describe your testing approach
- Note any infrastructure assumptions
- Update CRD documentation if adding fields
- Run `make test` and `make manifests` before committing
See config/ for generated manifests and RBAC rules.
Released under the MIT License.