Multi-node hybrid homelab infrastructure (RPi + x86 + Cloud VPS). Features distributed observability, ZRAM resource optimization, and automated disaster recovery via Bash & Docker.

tparzonka/homelab

🌐 Distributed Hybrid Homelab: Automation, Observability & Redundancy

Reference Architecture: A high-availability, resource-optimized distributed infrastructure spanning a local primary server, an ARM-based edge node, and a cloud-based external watchdog. It is designed to simulate production environments on legacy hardware.


🏗️ Architecture Overview

The infrastructure is strategically split across three distinct nodes to ensure high availability, efficient resource allocation, and a "Dead Man's Switch" monitoring capability.

1. acer-node (Primary Worker - Poland)

Hardware: Acer Switch 10 (Intel Atom, 2GB RAM, 500GB HDD)
The "Heavy Lifter" node handling resource-intensive applications.

  • Automation Hub: Self-hosted n8n instance for complex business & system workflows.
  • Media Stack: Automated management (Jellyfin, Radarr, Bazarr, qBittorrent).
  • Private Cloud: Secure file synchronization via Syncthing.

2. rpi-node (Edge Node - Poland)

Hardware: Raspberry Pi 3B (ARM, 1GB RAM)
The "Network Core" node handling essential lightweight services.

  • Network Security: Pi-hole for network-wide DNS-level ad-blocking.
  • Monitoring: Uptime Kuma for real-time service availability tracking.
  • Dashboard: Homer as a centralized landing page for the infrastructure.

3. vps-finland (External Watchdog - Helsinki)

Hardware: Cloud VPS (256MB RAM)
The "Guardian" node hosted in Helsinki, Finland.

  • Distributed Monitoring: A lightweight "Dead Man's Switch" watchdog. It hosts the critical monitoring scripts in an ultra-low-resource environment (256MB RAM), so the Poland nodes are watched from outside the home network with minimal overhead, even when local monitoring goes down with them.
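
One way to picture the "Dead Man's Switch" is a heartbeat check: the home nodes periodically record a timestamp on the VPS, and the watchdog raises an alert when that timestamp goes stale. The sketch below is only an illustration under that assumption. The function names, the heartbeat file, and the silence window are hypothetical and are not taken from check_poland.sh.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the last heartbeat is older
# than the allowed silence window. All values are in epoch seconds.
heartbeat_is_stale() {
    last_seen=$1     # epoch seconds of the last heartbeat received
    now=$2           # current epoch seconds
    max_silence=$3   # allowed silence in seconds (chosen by the caller)
    [ $(( now - last_seen )) -gt "$max_silence" ]
}

# Hypothetical wrapper: reads an epoch timestamp from a heartbeat file and
# prints OK or an ALERT line. The file path and its format are assumptions.
check_heartbeat_file() {
    file=$1
    max_silence=$2
    if [ ! -r "$file" ]; then
        echo "ALERT: no heartbeat file"
        return 1
    fi
    last_seen=$(cat "$file")
    if heartbeat_is_stale "$last_seen" "$(date +%s)" "$max_silence"; then
        echo "ALERT: heartbeat stale"
        return 1
    fi
    echo "OK"
}
```

A cron entry on the VPS could call `check_heartbeat_file` every few minutes and forward any ALERT line to Telegram. Because the home nodes push the heartbeat outward, this design would stay compatible with the "No Open Ports" policy.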

🧠 Engineering Challenges & Solutions

1. Extreme Resource Optimization (Legacy Hardware)

Problem: Running modern Docker stacks on 1GB/2GB RAM nodes leads to OOM (Out Of Memory) crashes.
Solution:

  • Implemented ZRAM (LZ4 compression) and tuned Kernel Swappiness (vm.swappiness=5) to handle memory spikes without killing the SD card.
  • On-Demand Service Lifecycle: Developed a custom Bash manager that starts heavy services (Radarr/Bazarr) via Telegram triggers and automatically shuts them down after 60 minutes of inactivity to free up RAM.
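
The two measures above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual manager: the 512M ZRAM size, the swap priority, the function names, and the way the last-activity timestamp is obtained are all assumptions. Only LZ4, vm.swappiness=5, and the 60-minute limit come from the text.

```shell
#!/bin/sh
# Sketch of a ZRAM swap setup. Must run as root. The 512M size and the
# swap priority of 100 are assumed values, not taken from the repository.
setup_zram() {
    modprobe zram
    echo lz4 > /sys/block/zram0/comp_algorithm   # set before disksize
    echo 512M > /sys/block/zram0/disksize
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0
    sysctl -w vm.swappiness=5
}

# Hypothetical idle check: succeeds when at least idle_limit seconds have
# passed since the last recorded activity (epoch seconds).
idle_limit_reached() {
    last_activity=$1
    now=$2
    idle_limit=$3
    [ $(( now - last_activity )) -ge "$idle_limit" ]
}

# Hypothetical lifecycle step: stop the named Compose services once they
# have been idle for 60 minutes (3600 s). Run it from the directory that
# holds the Compose file. How last_activity is tracked is left open here.
stop_if_idle() {
    last_activity=$1
    shift
    if idle_limit_reached "$last_activity" "$(date +%s)" 3600; then
        docker compose stop "$@"
    fi
}
```

A periodic job could call `stop_if_idle "$last_activity" radarr bazarr`, where the service names are whatever the Compose file defines.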

2. Security & Zero-Trust Networking

Problem: Need for remote access without exposing the home network via insecure Port Forwarding.
Solution:

  • No Open Ports: Eliminated all port forwarding on the router.
  • Encrypted Tunnels: Utilized WireGuard-based P2P tunnels for secure cross-node communication and administrative access.

3. Proactive Observability & Alerting

Problem: Distributed systems can fail "silently" if only local monitoring is used.
Solution:

  • Built a Multi-Node Health Check system. Custom Bash scripts monitor CPU temp, RAM, and Disk health.
  • Integrated the Telegram Bot API as a central message bus for hardware thresholds, service downtime, and automated backup confirmations.
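
A health check of this kind might look like the sketch below. It is an assumption-laden illustration rather than the contents of health_check.sh: the function names, the threshold values, and the TELEGRAM_TOKEN and TELEGRAM_CHAT_ID variable names are hypothetical. The sysfs and /proc paths are the usual Linux locations, and sendMessage is the standard Telegram Bot API method for posting text.

```shell
#!/bin/sh
# CPU temperature in whole degrees Celsius. The sysfs file reports
# millidegrees. The path is assumed to exist on the node.
cpu_temp_c() {
    raw=$(cat /sys/class/thermal/thermal_zone0/temp)
    echo $(( raw / 1000 ))
}

# Percentage of RAM in use, derived from MemTotal and MemAvailable.
ram_used_pct() {
    awk '/^MemTotal:/ { t = $2 } /^MemAvailable:/ { a = $2 }
         END { printf "%d\n", (t - a) * 100 / t }' /proc/meminfo
}

# Percentage of the root filesystem in use, parsed from POSIX df output.
disk_used_pct() {
    df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Succeeds when the integer value is strictly above the integer limit.
over_limit() {
    [ "$1" -gt "$2" ]
}

# Build a one-line alert: node, metric name, current value, limit.
alert_text() {
    printf '%s: %s at %s (limit %s)' "$1" "$2" "$3" "$4"
}

# Post a message through the Telegram Bot API. TELEGRAM_TOKEN and
# TELEGRAM_CHAT_ID are hypothetical variable names for the secrets.
send_telegram() {
    curl -fsS -X POST \
        "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
        --data-urlencode "text=$1" > /dev/null
}
```

A cron job could then combine the pieces, for example `t=$(cpu_temp_c); over_limit "$t" 75 && send_telegram "$(alert_text rpi-node 'CPU temp' "${t}C" 75C)"`, with 75 as an assumed threshold.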

4. Intelligent Disaster Recovery

Problem: Frequent SD card failures on Raspberry Pi and potential data loss on legacy drives.
Solution:

  • Developed an Identity-Based Backup Strategy using rclone and tar.
  • The logic triggers delta-syncs to Google Drive only if configuration changes (.yml, .sh, .conf) are detected within a 24-hour window, minimizing bandwidth and wear.
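
The change-detection gate described above can be sketched as follows. This is a hedged illustration, not the actual acer_backup.sh or backup_configs.sh: the function names, the archive path, and the remote name are assumptions, and a plain tar archive plus `rclone copy` stands in for whatever delta logic the real scripts use.

```shell
#!/bin/sh
# Succeeds when any .yml, .sh or .conf file under the given directory was
# modified within the last 24 hours (1440 minutes).
# Note: -mmin is available in GNU and BSD find but is not part of POSIX.
config_changed_recently() {
    [ -n "$(find "$1" -type f \
        \( -name '*.yml' -o -name '*.sh' -o -name '*.conf' \) \
        -mmin -1440 2>/dev/null | head -n 1)" ]
}

# Hypothetical backup step: archive the directory and upload it only when
# the gate above reports a recent change.
backup_if_changed() {
    src=$1      # directory holding the configuration files
    remote=$2   # rclone destination, e.g. "gdrive:homelab" (assumed name)
    if ! config_changed_recently "$src"; then
        echo "no changes, skipping"
        return 0
    fi
    archive="/tmp/config-$(date +%Y%m%d).tar.gz"   # assumed location
    tar -czf "$archive" -C "$src" . && rclone copy "$archive" "$remote"
}
```

Skipping the upload when nothing changed is what saves bandwidth and write cycles. The trade-off is that a change to a file type outside the three listed extensions would not trigger a backup.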

📂 Project Structure

├── acer-node/                # Primary Server (x86)
│   ├── media/                # Media stack (Docker Compose)
│   ├── n8n/                  # Automation engine (Docker Compose)
│   ├── acer_backup.sh        # Advanced cloud backup logic
│   ├── get_stats.sh          # System metrics collector
│   └── health_check.sh       # Hardware watchdog & Telegram alerts
│
├── rpi-node/                 # Edge Node (Raspberry Pi / ARM)
│   ├── homer/                # Central Dashboard
│   ├── pihole/               # Network-wide DNS protection
│   ├── uptime/               # Service availability monitoring
│   ├── backup_configs.sh     # Lightweight config sync script
│   └── health_check.sh       # RPi-specific hardware monitoring
│
└── vps-finland/              # External Cloud Node
    └── watchdog/
        └── check_poland.sh   # Distributed "Dead Man's Switch" logic

🤝 Transparency: Human vs. AI Role

This project was built using an AI-Augmented Systems Engineering methodology. The collaboration was structured to simulate a professional DevOps environment where strategic oversight meets technical automation.

  • Lead Implementation Engineer (Me):

    • Strategic Vision: Defined infrastructure goals, handled strict hardware constraints (legacy 1GB/2GB RAM devices), and prioritized services based on resource availability.
    • Hardware Orchestration: Physical setup and integration of the Raspberry Pi 3B, the Acer Switch 10 (Intel Atom), and the external storage arrays.
    • Hands-on Execution: Manual deployment of Docker stacks, terminal-level troubleshooting, and system-level repairs (including EXT4 filesystem recovery).
    • Critical Decision Making: Evaluated and selected tools (e.g., opting for Alpine Linux on VPS) to minimize overhead.
  • Technical Architect & Advisor (AI):

    • Blueprinting: Provided optimized Docker Compose templates and advised on system-level tuning parameters (ZRAM, Swappiness).
    • Scripting Support: Assisted in developing custom Bash scripts for hardware monitoring, "Dead Man's Switch" logic, and the "On-Demand" service lifecycle manager.
    • Troubleshooting Consultant: Acted as a deep-level technical resource for resolving complex Linux errors (UUID mounting issues, Docker PID pipe EOF errors).

🚀 Technology Stack

  • Containerization: Docker, Docker Compose
  • Operating Systems: Debian (x86), Raspberry Pi OS (ARM), Alpine Linux (Cloud)
  • Scripting & Automation: Bash (System Logic), n8n (Workflow Automation)
  • Monitoring & Observability: Uptime Kuma, Pi-hole, Telegram Bot API
  • Cloud & Storage: Rclone (Google Drive), Mikr.us (External VPS)

🔒 Security & Privacy Note

This repository serves as a Sanitized Reference Implementation. For security reasons:

  • All sensitive data (API Tokens, Passwords, IP Addresses) has been replaced with placeholders like [YOUR_SECURE_TOKEN].
  • In production, these variables are managed via encrypted .env files and secret managers, which are excluded from version control.
  • The architecture follows a "No Port Forwarding" policy to minimize the attack surface of the home network.
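
As an illustration of keeping secrets out of the scripts themselves, the sketch below loads values from a dotenv-style file and fails early when one is missing. It assumes the file has already been decrypted by whatever mechanism is in use. The function names and the TELEGRAM_TOKEN variable name are hypothetical.

```shell
#!/bin/sh
# Load KEY=value pairs from a dotenv-style file into the environment.
# Pass a path containing a slash, since "." may otherwise search PATH.
load_env() {
    env_file=$1
    if [ ! -r "$env_file" ]; then
        echo "missing $env_file" >&2
        return 1
    fi
    set -a            # export every variable assigned while sourcing
    . "$env_file"
    set +a
}

# Fail early if a required variable is unset or empty.
require_var() {
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "required variable $1 is not set" >&2
        return 1
    fi
}
```

A script could start with `load_env ./.env && require_var TELEGRAM_TOKEN` so that a missing secret stops the run before any alert or backup is attempted. Because the file is sourced as shell code, it should only ever come from a trusted location.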
