Internal Cluster Management

A lightweight, distributed cluster management system built in Go for managing 100+ Ubuntu/Debian VMs in Proxmox environments. Features real-time node monitoring, file distribution across nodes, and a professional dark-mode dashboard accessible from any node.

Features

  • Zero-Configuration Join: Bootstrap the cluster on one node and get a join token for all the others
  • Distributed Architecture: Built on etcd for reliable cluster state management
  • Node Management: Join/leave cluster, real-time health monitoring with heartbeats
  • File Distribution: Upload, download, and manage files across all nodes (perfect for certificate distribution)
  • Web Dashboard: Professional dark-mode UI accessible from any node on port 8080
  • CLI Tools: Comprehensive command-line interface for all operations
  • Labels & Tagging: Organize nodes with custom labels (e.g., role=database, env=production)

Quick Start

1. Setup etcd (One-Time, on Management Node)

# Install etcd
ETCD_VER=v3.6.5
wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf etcd-${ETCD_VER}-linux-amd64.tar.gz
sudo mv etcd-${ETCD_VER}-linux-amd64/etcd* /usr/local/bin/

# Start etcd
etcd --listen-client-urls http://0.0.0.0:2379 \
     --advertise-client-urls http://YOUR_IP:2379
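
Before moving on, it is worth confirming that etcd is reachable over the network. The same health check appears later in the Troubleshooting section, and etcdctl was copied to /usr/local/bin alongside etcd above:

# Quick sanity check, ideally from a different machine
curl http://YOUR_IP:2379/health
etcdctl --endpoints=http://YOUR_IP:2379 endpoint health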

2. Build and Install

# Build the agent
go build -o bin/cluster-agent ./cmd/agent

# Or use Make
make build

# Copy to all nodes
scp bin/cluster-agent root@node:/usr/local/bin/
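
If you build somewhere other than a Linux/amd64 machine (a macOS laptop, for example), cross-compile for the target VMs before copying the binary. These are standard Go build variables, nothing specific to this project:

# Cross-compile a Linux binary for the Ubuntu/Debian VMs
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o bin/cluster-agent ./cmd/agent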

3. Initialize Cluster on First Node

# On your first node (e.g., your cert management VM)
cluster-agent init --etcd http://etcd-server-ip:2379

This will output something like:

╔════════════════════════════════════════════════════════════════╗
║              Cluster Initialized Successfully!                 ║
╚════════════════════════════════════════════════════════════════╝

Node Information:
  ID:        abc123...
  Name:      cert-manager
  IP:        192.168.1.100
  Port:      8080
  Dashboard: http://192.168.1.100:8080

╔════════════════════════════════════════════════════════════════╗
║  Run this command on other nodes to join the cluster:         ║
╚════════════════════════════════════════════════════════════════╝

  cluster-agent daemon eyJldGNkX2VuZHBvaW50cyI6WyIxOTIuMTY4LjEuMToyMzc5Il0sImNsdXN0ZXJfaWQiOiJjbHVzdGVyLWFiYzEyMyJ9
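
The long argument after daemon is the join token. As the Pro Tip at the end of this README says, save it right away, for example in an environment variable or a root-only file (the path below is only a suggestion):

# Keep the token handy for later list/leave/file commands
export CLUSTER_TOKEN="eyJldGNkX2VuZHBvaW50cyI6WyIxOTIuMTY4LjEuMToyMzc5Il0sImNsdXN0ZXJfaWQiOiJjbHVzdGVyLWFiYzEyMyJ9"
echo "$CLUSTER_TOKEN" > /root/.cluster-token && chmod 600 /root/.cluster-token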

4. Join Other Nodes

Copy the command from step 3 and run it on each of your other 99 nodes:

# On nodes 2-100
cluster-agent daemon eyJldGNkX2VuZHBvaW50cyI6WyIxOTIuMTY4LjEuMToyMzc5Il0sImNsdXN0ZXJfaWQiOiJjbHVzdGVyLWFiYzEyMyJ9

That's it! All nodes are now in the cluster with the dashboard running.
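
To confirm the join worked, list the nodes from any machine that has the token, or hit the health endpoint described later under API Reference:

# Should show every joined node
cluster-agent list --token "$CLUSTER_TOKEN"

# Each agent also answers a simple health check on its API port
curl http://192.168.1.100:8080/api/health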

Usage

Node Management

List all nodes:

cluster-agent list --token <your-join-token>

Leave cluster:

cluster-agent leave --token <your-join-token>

File Management

Upload a file (e.g., SSL certificate):

cluster-agent file upload /path/to/cert.pem node1:8080

List files in cluster:

cluster-agent file list --token <your-join-token>

Download file from a node:

cluster-agent file download <file-id> node1:8080 /local/path/cert.pem

Example: Distribute Certificates

# On your certificate management VM
cluster-agent file upload /etc/ssl/certs/my-app.crt 192.168.1.100:8080

# The file is now tracked in the cluster
cluster-agent file list --token <token>

# On each target node, download the certificate
cluster-agent file download <file-id> 192.168.1.100:8080 /etc/ssl/certs/my-app.crt
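
With 100 nodes you will not want to run that download by hand on every machine. A small loop over a host list (the same nodes.txt pattern used in the Deployment Script section) keeps it manageable; the file ID still comes from file list:

# Fan the certificate out to every node in nodes.txt
while read -r node; do
    ssh root@$node "cluster-agent file download <file-id> 192.168.1.100:8080 /etc/ssl/certs/my-app.crt"
done < nodes.txt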

Dashboard

Access the dashboard from any node at http://node-ip:8080

Features:

  • Real-time node status (updates every 5 seconds)
  • Online/offline nodes visualization
  • Node details (IP, labels, last seen)
  • Professional dark mode interface
  • No external dependencies

Systemd Service

Create /etc/systemd/system/cluster-agent.service:

[Unit]
Description=Cluster Management Agent
After=network.target

[Service]
Type=simple
User=root
Environment="CLUSTER_TOKEN=eyJldGNkX2VuZHBvaW50cyI6WyIxOTIuMTY4LjEuMToyMzc5Il0sImNsdXN0ZXJfaWQiOiJjbHVzdGVyLWFiYzEyMyJ9"
ExecStart=/usr/local/bin/cluster-agent daemon $CLUSTER_TOKEN --labels role=app,env=prod
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable cluster-agent
sudo systemctl start cluster-agent
sudo systemctl status cluster-agent

Architecture

┌─────────────────────────────────────────────────────────┐
│                     etcd Cluster                         │
│              (Distributed State Store)                   │
└─────────────────────────────────────────────────────────┘
                        ▲  ▲  ▲
                        │  │  │
        ┌───────────────┘  │  └──────────────┐
        │                  │                  │
        ▼                  ▼                  ▼
┌─────────────┐   ┌─────────────┐   ┌─────────────┐
│   Node 1    │   │   Node 2    │   │   Node N    │
│ (init node) │   │             │   │             │
│  Agent      │   │  Agent      │   │  Agent      │
│  + API      │   │  + API      │   │  + API      │
│  + Dashboard│   │  + Dashboard│   │  + Dashboard│
│  :8080      │   │  :8080      │   │  :8080      │
└─────────────┘   └─────────────┘   └─────────────┘

Components

  1. etcd: Distributed key-value store (3 nodes recommended for HA)
  2. Agent: Runs on each node, manages cluster membership and serves dashboard
  3. Dashboard: Web UI for monitoring (embedded in agent)
  4. File Manager: Handles file distribution across the cluster

Join Token

The join token is a base64-encoded JSON object containing:

  • etcd endpoints
  • Cluster ID

Example decoded token:

{
  "etcd_endpoints": ["192.168.1.1:2379"],
  "cluster_id": "cluster-abc123"
}
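
Because the token is just base64-encoded JSON, you can inspect one locally with standard tools (jq is optional and only used for pretty-printing; the example token above uses the plain base64 alphabet):

# Decode a join token to see which etcd endpoints and cluster it points at
echo '<your-join-token>' | base64 -d | jq .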

API Reference

The agent exposes a REST API on port 8080:

Nodes

  • GET /api/nodes - List all nodes
  • GET /api/nodes/{id} - Get node details
  • GET /api/health - Health check
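
The node listing is plain JSON over HTTP, so it is easy to script against as well as to feed the dashboard (jq below is only for readability):

# Inspect the cluster from the command line
curl -s http://192.168.1.100:8080/api/nodes | jq .
curl -s http://192.168.1.100:8080/api/health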

Files

  • GET /api/files - List all files
  • POST /api/files - Upload a file (multipart/form-data)
  • GET /api/files/{id} - Get file metadata
  • GET /api/files/{id}/download - Download file
  • DELETE /api/files/{id} - Delete file
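
The file endpoints can be driven with curl as well. The multipart field name below is a guess (the cluster-agent file commands above are the documented interface), so treat this as a sketch:

# Upload: the "file" form field name is an assumption; check pkg/api if it fails
curl -F "file=@/etc/ssl/certs/my-app.crt" http://192.168.1.100:8080/api/files

# Fetch metadata, then download the content
curl -s http://192.168.1.100:8080/api/files/<file-id>
curl -o my-app.crt http://192.168.1.100:8080/api/files/<file-id>/download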

Advanced Configuration

High Availability etcd

For production, run etcd in cluster mode (3 or 5 nodes):

# Node 1
etcd --name node1 \
  --initial-advertise-peer-urls http://10.0.0.1:2380 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://10.0.0.1:2379 \
  --initial-cluster node1=http://10.0.0.1:2380,node2=http://10.0.0.2:2380,node3=http://10.0.0.3:2380
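
node2 and node3 run the same command with their own names and IPs; for completeness, here is the matching node2 invocation:

# Node 2 (node3 is identical, substituting node3 / 10.0.0.3)
etcd --name node2 \
  --initial-advertise-peer-urls http://10.0.0.2:2380 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://10.0.0.2:2379 \
  --initial-cluster node1=http://10.0.0.1:2380,node2=http://10.0.0.2:2380,node3=http://10.0.0.3:2380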

Then bootstrap the cluster, pointing at all etcd endpoints:

cluster-agent init --etcd http://10.0.0.1:2379,http://10.0.0.2:2379,http://10.0.0.3:2379

Custom Node Labels

# Add labels during init
cluster-agent init --labels role=cert-manager,tier=management,env=prod

# Add labels when joining
cluster-agent daemon <token> --labels role=database,tier=data,env=prod

Custom Port

# Use different port (default is 8080)
cluster-agent init --port 9090
cluster-agent daemon <token> --port 9090

Troubleshooting

Agent won't start:

  • Check etcd is running: curl http://etcd-server:2379/health
  • Verify token is correct
  • Check firewall allows ports 8080 and 2379
  • Check logs: journalctl -u cluster-agent -f

Node shows offline:

  • Heartbeat might have failed (recovers automatically in 10s)
  • Check network connectivity to etcd
  • Verify node is still running: systemctl status cluster-agent

File transfer fails:

  • Ensure source node is online
  • Check target node can reach source node on port 8080
  • Verify file exists on source node

Invalid join token:

  • Ensure you copied the entire token
  • Token is case-sensitive
  • Re-run init to generate a new token if lost

Development

Project Structure

.
├── cmd/
│   ├── agent/          # Agent CLI
│   └── server/         # Standalone server (optional)
├── pkg/
│   ├── api/            # REST API handlers
│   ├── cluster/        # Cluster management logic
│   │   └── token.go    # Join token generation
│   ├── filemanager/    # File distribution
│   ├── node/           # Node models
│   └── store/          # etcd abstraction
├── web/
│   └── dashboard/      # Web UI (HTML/CSS/JS)
├── Makefile            # Build automation
└── README.md

Building

# Build agent
make agent

# Build server
make server

# Build both
make build

# Clean
make clean

# Run tests
make test

Deployment Script

For manual deployment to multiple nodes:

# Create a nodes list file
cat > nodes.txt <<EOF
192.168.1.10
192.168.1.11
192.168.1.12
EOF

# Get your join token from init
TOKEN="your-token-here"

# Deploy to all nodes
while read -r node; do
    echo "Deploying to $node..."
    scp bin/cluster-agent root@$node:/usr/local/bin/
    # nohup and the output redirect keep the agent alive after the ssh session
    # closes (the log path is just an example)
    ssh root@$node "nohup cluster-agent daemon $TOKEN > /var/log/cluster-agent.log 2>&1 &"
done < nodes.txt
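
For something longer-lived than a backgrounded SSH command, the same loop can install the systemd unit from the Systemd Service section instead. This assumes you have saved that unit, with your token filled in, as cluster-agent.service next to the script:

# Variant: ship the binary plus the systemd unit and let systemd supervise it
while read -r node; do
    echo "Deploying to $node..."
    scp bin/cluster-agent root@$node:/usr/local/bin/
    scp cluster-agent.service root@$node:/etc/systemd/system/cluster-agent.service
    ssh root@$node "systemctl daemon-reload && systemctl enable --now cluster-agent"
done < nodes.txt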

License

Proprietary - CodeCreation Labs

Support

For issues or questions, contact your cluster administrator.


Pro Tip: Save your join token! You'll need it to add nodes, list nodes, and perform other administrative tasks. Store it in a password manager or an environment variable.
