Welcome to my homelab infrastructure repository! This is the central hub for documenting my homelab setup, configurations, and learnings.
Note that this project is not intended for heavy production workloads. In Cambodia, stable electricity and high-speed internet are not guaranteed, so this homelab is designed for learning, experimentation, and personal projects rather than critical applications. If you plan to use this as a reference for your own homelab, consider the constraints of your own environment and adjust accordingly.
I intend to release YouTube videos documenting this journey.
Model: GMKTec NucBox M5 Ultra
CPU: AMD Ryzen 7 7730U
- 8 Cores / 16 Threads
- Base Clock: 2.0 GHz
- Boost Clock: up to 4.5 GHz
- L1 Cache: 512 KB (64 KB × 8)
- L2 Cache: 4 MB (512 KB × 8)
- L3 Cache: 16 MB (shared)
- TDP: 15-35W
- Architecture: Zen 3+ (6nm)
Memory: 64 GB DDR4 RAM
- Type: DDR4 SO-DIMM
- Speed: 3200 MHz
- Configuration: Dual Channel
- Maximum Capacity: 64 GB
Storage: 1 TB NVMe SSD
- Type: M.2 2280 NVMe PCIe Gen 3.0
- Read Speed: ~3500 MB/s
- Write Speed: ~3000 MB/s
- TBW: High endurance model
Graphics: AMD Radeon Graphics
- Integrated GPU (Ryzen 7730U)
- Cores: 8 CUs (Compute Units)
- Architecture: RDNA 2
- Max Frequency: 2.0 GHz
- Display Support: Dual 4K@60Hz or Single 8K@30Hz
Operating System:
- Proxmox VE 9.1.1 (Debian-based hypervisor)
- Linux Kernel 6.8+
- ZFS or ext4 filesystem options
- Web UI: https://192.168.100.50:8006
- ✅ Power Efficiency - 15-35W TDP, ideal for 24/7 operation
- ✅ Performance - 8c/16t handles multiple VMs simultaneously
- ✅ Memory - 64 GB sufficient for 8+ VMs with headroom
- ✅ Storage - Fast NVMe for quick VM boot and low latency
- ✅ Dual NIC - Network segregation and redundancy
- ✅ Compact - Minimal footprint, quiet operation
- ✅ Cost Effective - Fraction of enterprise server costs
This homelab runs on Proxmox VE and uses Infrastructure as Code (Terraform + Ansible) for reproducible deployments.
| VM | IP | CPU | RAM | Disk | Purpose |
|---|---|---|---|---|---|
| k8s-master | 192.168.100.201 | 2 | 4 GB | 50 GB | Kubernetes control plane (K3s) |
| k8s-worker-1 | 192.168.100.202 | 4 | 14 GB | 150 GB | Kubernetes worker node |
| k8s-worker-2 | 192.168.100.203 | 4 | 14 GB | 150 GB | Kubernetes worker node |
| database-vm | 192.168.100.205 | 4 | 6 GB | 200 GB | Centralized database server |
| app-gateway | 192.168.100.210 | 2 | 2 GB | 20 GB | Nginx reverse proxy |
| monitoring | 192.168.100.220 | 2 | 6 GB | 80 GB | Prometheus, Grafana, Loki |
| n8n | 192.168.100.230 | 2 | 6 GB | 50 GB | n8n workflow automation |
| ci-cd | 192.168.100.240 | 2 | 5 GB | 100 GB | GitHub Actions + Docker registry |
Total Resources: 22 vCPUs, 57 GB RAM, 800 GB storage
Available for Host: 7 GB RAM (~10.9%) and ~200 GB storage. Note that the 22 allocated vCPUs slightly oversubscribe the 16 hardware threads, which is acceptable for bursty homelab workloads.
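These totals can be re-derived from the VM table with a quick `awk` pass (a throwaway sanity check; the `/tmp/vm-alloc.txt` scratch file is just for illustration):

```shell
# Columns: name vcpu ram_gb disk_gb (values copied from the table above)
cat <<'EOF' > /tmp/vm-alloc.txt
k8s-master 2 4 50
k8s-worker-1 4 14 150
k8s-worker-2 4 14 150
database-vm 4 6 200
app-gateway 2 2 20
monitoring 2 6 80
n8n 2 6 50
ci-cd 2 5 100
EOF

# Sum each column
awk '{c+=$2; r+=$3; d+=$4} END {printf "vCPU=%d RAM=%dGB Disk=%dGB\n", c, r, d}' /tmp/vm-alloc.txt
# → vCPU=22 RAM=57GB Disk=800GB
```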
Resource Allocation Notes:
- All VMs have adequate resources for their intended workloads
- CI/CD VM may require additional RAM (8GB+) for heavy Docker builds
- 7GB RAM reserved for Proxmox host ensures system stability
- Storage allocation allows for data growth and logs
- Balanced allocation prevents resource contention
```text
homelab-journey/
├── terraform/                      # Infrastructure provisioning
│   ├── vms.tf                      # VM definitions
│   ├── variables.tf                # Configurable variables
│   └── terraform.tfvars            # Your configuration
├── ansible/                        # Configuration management
│   ├── inventory.ini               # VM inventory
│   ├── ansible.cfg                 # Ansible configuration
│   ├── README.md                   # Ansible documentation
│   └── playbooks/                  # Playbooks organized by category
│       ├── infrastructure/         # Core infrastructure setup
│       │   ├── proxmox-setup.yml
│       │   ├── qemu-agent-setup.yml
│       │   ├── docker-setup.yml
│       │   └── nginx-gateway-setup.yml
│       ├── kubernetes/             # K8s cluster deployment
│       │   └── k3s-cluster-setup.yml
│       ├── services/               # Application services
│       │   ├── database-setup.yml
│       │   ├── monitoring-setup.yml
│       │   ├── monitoring-dashboards-setup.yml
│       │   ├── n8n-setup.yml
│       │   └── cicd-setup.yml
│       └── networking/             # Network & remote access
│           ├── cloudflare-tunnel-setup.yml
│           └── github-runner-setup.yml
├── script/                         # Helper scripts
│   ├── setup-k8s-complete.sh
│   ├── setup-cloudflare-tunnel.sh
│   ├── run-monitoring-setup.sh
│   ├── run-monitoring-dashboards-setup.sh
│   ├── run-n8n-setup.sh
│   ├── run-database-setup.sh
│   ├── generate-ssh-keys.sh
│   └── cleanup-known-hosts.sh
└── diagram/                        # Infrastructure diagrams
```
Before starting, ensure you have:
- ✅ Proxmox VE installed and configured
- ✅ SSH keys generated (run `./generate-ssh-keys.sh`; you'll find the keys in the `ssh-keys/` directory after running the script. They are used for Ansible access to the VMs)
- ✅ Ansible installed on your control machine
- ✅ GitHub account (for CI/CD setup)
- ✅ Cloudflare account (for tunnel setup)
- ✅ Domain name configured in Cloudflare DNS
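For reference, the keypair can also be generated by hand with `ssh-keygen` (a sketch of roughly what `generate-ssh-keys.sh` automates; the key type and file names here are assumptions):

```shell
# Generate an ed25519 keypair for Ansible into ssh-keys/
mkdir -p ssh-keys
ssh-keygen -t ed25519 -N "" -C "ansible@homelab" -f ssh-keys/id_ansible

# The public half goes into proxmox-setup.yml; the private half is
# referenced from the Ansible inventory.
cat ssh-keys/id_ansible.pub
```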
Replace `YOUR_SSH_PUBLIC_KEY_HERE` in `proxmox-setup.yml` with your actual SSH public key to enable passwordless SSH access to the VMs.
```shell
cd ansible
ansible-playbook playbooks/infrastructure/proxmox-setup.yml
```

This playbook installs basic packages (fastfetch, htop, lm-sensors, qemu-guest-agent) and creates the cloud-init configuration that Terraform uses to add an SSH user with your public key.
For now, the CPU type for `proxmox_virtual_environment_vm` is set to `host` because MongoDB v8.4 failed (AVX2 not detected) with the `X86-64-AES` CPU type. We will investigate this further and update the configuration if needed. This could become an issue if we later migrate to a host with a different CPU architecture.
Make sure to create `terraform.tfvars` with your specific configuration (IP addresses, credentials, etc.) before running Terraform. See Terraform Configuration for details.
```shell
cd terraform
terraform init
terraform plan
terraform apply -auto-approve
```

It may take a while for the qemu-guest-agent to be installed and for the VMs to be fully provisioned and report back to Terraform. You can monitor the VM creation process in the Proxmox web UI.
```shell
ssh -i path/to/ssh-keys ubuntu@192.168.100.201  # Master node
ssh -i path/to/ssh-keys ubuntu@192.168.100.202  # Worker 1
ssh -i path/to/ssh-keys ubuntu@192.168.100.203  # Worker 2
```

This section provides detailed step-by-step instructions for manually setting up the entire infrastructure.
Before setting up each component, create an Ansible `inventory.ini` from the provided `inventory.ini.example` and update the IP addresses and credentials as needed. Replace `/path/to/your/key` with the path to the private SSH key that matches the public key you added to `proxmox-setup.yml`; Ansible uses it to access the VMs.
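A minimal `inventory.ini` might look like this (a sketch; the IPs come from the VM table above, while the group names are assumptions — follow `inventory.ini.example` for the real layout):

```ini
[k8s_master]
192.168.100.201 ansible_user=ubuntu ansible_ssh_private_key_file=/path/to/your/key

[k8s_workers]
192.168.100.202 ansible_user=ubuntu ansible_ssh_private_key_file=/path/to/your/key
192.168.100.203 ansible_user=ubuntu ansible_ssh_private_key_file=/path/to/your/key

[database]
192.168.100.205 ansible_user=ubuntu ansible_ssh_private_key_file=/path/to/your/key
```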
This will install GitHub Actions runner and Docker registry on the CI/CD VM.
```shell
cd ansible
ansible-playbook playbooks/services/cicd-setup.yml
```

Note that you will need to configure the GitHub runner token manually after installation. See the playbook documentation for instructions.
Run the K3s cluster setup playbook to configure the master and worker nodes:
```shell
cd ansible
ansible-playbook playbooks/kubernetes/k3s-cluster-setup.yml
```

This playbook will:
- Install K3s on the master node
- Configure worker nodes to join the cluster
- Setup kubectl configuration
- Verify cluster health
Verify:
```shell
ssh -i path/to/ssh-keys ubuntu@192.168.100.201  # SSH into master node
kubectl get nodes
# Should show: k8s-master, k8s-worker-1, k8s-worker-2 all Ready
```

Each K8s worker node is configured to pull images from the registry on the CI/CD VM via the `registry.homelab.local` hostname.
Verify registry access:
```shell
# From the master node
curl -v http://registry.homelab.local/v2/_catalog
# Should return JSON with a list of repositories (even if empty)
```

This will install Nginx Proxy Manager on the gateway VM and configure it as a reverse proxy for your Kubernetes services.
```shell
# Run the gateway setup playbook
cd ansible
ansible-playbook playbooks/networking/nginx-gateway-setup.yml
```

The playbook will:
- Install Nginx on the gateway VM
- Setup health checks and monitoring
Access the admin UI at http://192.168.100.210:81. You will be asked to create an admin account on first login. Use this UI to configure reverse proxy rules for your Kubernetes services.
This will install PostgreSQL, MySQL, and MongoDB on the database server VM. Create `.env` from the provided `.env.example` in the ansible directory and set secure passwords for each database, then run the prepared script to set up the databases.
```shell
chmod +x ./script/run-database-setup.sh
./script/run-database-setup.sh
```

Note that this loads passwords from the `.env` file. If you run the Ansible playbook directly, set the environment variables yourself or edit the playbook to include the passwords.
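The `.env` file might look roughly like this (variable names are illustrative assumptions; follow `.env.example` for the actual ones):

```shell
# .env — never commit this file
POSTGRES_PASSWORD='change-me'
MYSQL_ROOT_PASSWORD='change-me'
MONGODB_ROOT_PASSWORD='change-me'
```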
This will install n8n on the n8n VM and configure it to use the PostgreSQL database on the database server VM.
```shell
cd ansible
ansible-playbook playbooks/services/n8n-setup.yml
```

The CI/CD playbook (`cicd-setup.yml`) will:
- Install GitHub Actions self-hosted runner
- Install Docker and setup local registry
- Install kubectl and helm for Kubernetes deployments
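Once the runner and registry are up, a deployment workflow might look like the sketch below (hypothetical: repository layout, image name, and deployment name are assumptions, not part of this repo):

```yaml
# .github/workflows/deploy.yml (illustrative)
name: build-and-deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: self-hosted            # runs on the ci-cd VM's runner
    steps:
      - uses: actions/checkout@v4
      - name: Build and push to the local registry
        run: |
          docker build -t registry.homelab.local/myapp:${GITHUB_SHA::7} .
          docker push registry.homelab.local/myapp:${GITHUB_SHA::7}
      - name: Deploy to K3s
        run: |
          kubectl set image deployment/myapp \
            myapp=registry.homelab.local/myapp:${GITHUB_SHA::7} -n default
```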
This will install Prometheus, Grafana, and Loki on the monitoring VM and configure them to monitor your Kubernetes cluster and other services.
```shell
cd ansible
ansible-playbook playbooks/services/monitoring-setup.yml
ansible-playbook playbooks/services/monitoring-dashboards-setup.yml
```

```shell
# Check node health
kubectl get nodes
kubectl describe node k8s-worker-1

# View all resources
kubectl get all --all-namespaces

# Check cluster component status
kubectl get componentstatuses

# View cluster events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'
```

```shell
# List all pods
kubectl get pods -n your-namespace

# View deployment status
kubectl get deployments -n your-namespace

# Check services
kubectl get svc -n your-namespace

# View ingress routes
kubectl get ingress --all-namespaces

# Port forward for local access
kubectl port-forward svc/myapp 8080:80 -n your-namespace
```

```shell
# View pod logs
kubectl logs pod-name -n namespace

# Follow logs in real-time
kubectl logs -f deployment/myapp -n namespace

# View logs from the previous container instance
kubectl logs pod-name --previous -n namespace

# Get the last 100 lines
kubectl logs --tail=100 pod-name -n namespace

# View logs from a specific container in a multi-container pod
kubectl logs pod-name -c container-name -n namespace
```

Issue: Applications cannot connect to databases on database-vm
Diagnostic Steps:
```shell
# Test database connectivity from a K8s pod
kubectl run -it --rm psql-test --image=postgres:16 --restart=Never -- \
  psql -h 192.168.100.205 -U postgres -d postgres

# Test from your local machine
psql -h 192.168.100.205 -U postgres -d postgres
mysql -h 192.168.100.205 -u root -p
mongosh mongodb://192.168.100.205:27017
```
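Before reaching for the database clients, a bare TCP check with bash's built-in `/dev/tcp` can tell you whether the ports are reachable at all (no client tools needed):

```shell
# Quick TCP reachability check against database-vm.
# Ports: PostgreSQL 5432, MySQL 3306, MongoDB 27017.
host=192.168.100.205
for port in 5432 3306 27017; do
  if timeout 2 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "port ${port}: open"
  else
    echo "port ${port}: closed or filtered"
  fi
done
```

If a port shows closed here, the problem is at the firewall or bind-address level, not in database credentials.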
```shell
# Check that the database services are running
ssh user@192.168.100.205
sudo systemctl status postgresql
sudo systemctl status mysql
sudo systemctl status mongod

# Check firewall rules
sudo ufw status
# Should show ports 5432, 3306, and 27017 allowed

# Verify databases are listening on all interfaces
sudo netstat -tlnp | grep -E '5432|3306|27017'
# Should show 0.0.0.0:PORT, not 127.0.0.1:PORT
```

Solutions:
- **PostgreSQL Not Allowing Remote Connections:**

  ```shell
  # Edit pg_hba.conf
  sudo nano /etc/postgresql/16/main/pg_hba.conf
  # Add: host all all 192.168.100.0/24 scram-sha-256

  # Edit postgresql.conf
  sudo nano /etc/postgresql/16/main/postgresql.conf
  # Set: listen_addresses = '*'

  # Restart PostgreSQL
  sudo systemctl restart postgresql
  ```

- **MySQL Binding to Localhost Only:**

  ```shell
  # Edit MySQL config
  sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf
  # Set: bind-address = 0.0.0.0

  # Restart MySQL
  sudo systemctl restart mysql

  # Grant remote access (MySQL 8+ removed IDENTIFIED BY from GRANT,
  # so create the user first)
  mysql -u root -p
  CREATE USER IF NOT EXISTS 'root'@'%' IDENTIFIED BY 'password';
  GRANT ALL PRIVILEGES ON *.* TO 'root'@'%';
  FLUSH PRIVILEGES;
  ```

- **MongoDB Not Allowing Remote Connections:**

  ```shell
  # Edit mongod.conf
  sudo nano /etc/mongod.conf
  # Set: bindIp: 0.0.0.0

  # Restart MongoDB
  sudo systemctl restart mongod
  ```
Issue: Node or pod consuming excessive CPU/memory
Diagnostic Steps:
```shell
# Check node resource usage
kubectl top nodes

# Check pod resource usage
kubectl top pods --all-namespaces --sort-by=memory
kubectl top pods --all-namespaces --sort-by=cpu

# View detailed node metrics
kubectl describe node node-name

# Check for resource limits
kubectl describe deployment deployment-name -n namespace | grep -A 5 Limits
```

Solutions:
- **Adjust Resource Limits:**

  ```yaml
  resources:
    requests:
      memory: "256Mi"
      cpu: "200m"
    limits:
      memory: "512Mi"
      cpu: "500m"
  ```

- **Scale Down Non-Critical Pods:**

  ```shell
  kubectl scale deployment/myapp --replicas=1 -n namespace
  ```

- **Restart Problematic Pods:**

  ```shell
  kubectl rollout restart deployment/myapp -n namespace
  ```

- **Check for Memory Leaks:**

  ```shell
  # Monitor a pod over time
  watch kubectl top pod pod-name -n namespace
  # View Grafana dashboards for trends: http://192.168.100.220:3000
  ```
Issue: Pods cannot communicate with each other or external services
Diagnostic Steps:
```shell
# Test pod-to-pod communication
kubectl exec -it pod1 -n namespace -- ping pod2-ip

# Test external connectivity
kubectl exec -it pod-name -n namespace -- ping 8.8.8.8
kubectl exec -it pod-name -n namespace -- curl https://google.com

# Check the CNI plugin (Flannel in K3s)
kubectl get pods -n kube-system -l app=flannel

# Verify network policies (if any)
kubectl get networkpolicies --all-namespaces

# Check iptables rules (on the nodes)
ssh user@node-ip
sudo iptables -L -n -v
```

Solutions:
- Restart affected pods
- Restart CNI plugin pods
- Check the K3s service on the nodes: `sudo systemctl status k3s` (master) or `sudo systemctl status k3s-agent` (workers)
- Verify routing: `ip route show`
- **Always check events first:** `kubectl get events -n namespace --sort-by='.lastTimestamp'`
- **Use describe for detailed info:** `kubectl describe <resource-type> <resource-name> -n namespace`
- **Check logs systematically:**
  - Application logs: `kubectl logs`
  - System logs: `journalctl`
  - Service logs: `systemctl status`
- **Isolate the issue:**
  - Test from multiple points (local, same node, different node)
  - Simplify (reduce to a minimal reproduction case)
  - Compare with working examples
- **Use debug containers:**

  ```shell
  # Run a temporary debugging pod
  kubectl run debug -it --rm --image=nicolaka/netshoot --restart=Never -- bash
  ```

- **Document and version:**
  - Keep notes of issues and solutions
  - Track configuration changes in Git
  - Use Git tags for stable states
All playbooks are organized in ansible/playbooks/ by functional category:
- Infrastructure - Proxmox, Docker, Nginx, QEMU agent (4 playbooks)
- Kubernetes - K3s cluster deployment (1 playbook)
- Services - Databases, monitoring, n8n, CI/CD (5 playbooks)
- Networking - Cloudflare tunnel, GitHub runner (2 playbooks)
This repository follows Infrastructure as Code principles:
- All infrastructure defined in code (Terraform + Ansible + Kubernetes)
- Version controlled and reproducible
- Automated CI/CD with GitHub Actions
- Secure remote access via Cloudflare Tunnel
- Documented for learning and future reference
- Designed for homelab experimentation and learning
