Skanderba8/linux-production-infrastructure

A production-style Linux infrastructure project demonstrating system administration, security hardening, and Infrastructure as Code (IaC) practices using Ansible automation.



🎯 Project Overview

This project showcases the complete lifecycle of building, securing, and automating a multi-server Linux environment from scratch. It demonstrates real-world DevOps and system administration practices used in production environments.

What This Project Demonstrates

  • Infrastructure as Code (IaC): Everything is reproducible and version-controlled
  • Security First: Multi-layered security approach with automated hardening
  • Automation: Manual tasks converted to reusable Ansible playbooks
  • Service Deployment: Full 3-tier web application stack (Nginx → Node.js → PostgreSQL)
  • Monitoring Ready: Metrics collection infrastructure with node_exporter
  • Professional Documentation: Clear, comprehensive, and maintainable

Project Goals

  1. ✅ Build a multi-server Linux environment with proper networking
  2. ✅ Implement security best practices (SSH hardening, firewalls, intrusion prevention)
  3. ✅ Automate everything with Ansible for repeatability
  4. ✅ Deploy production-ready services (web, application, database tiers)
  5. ✅ Implement centralized monitoring and alerting
  6. ✅ Create automated backup and disaster recovery procedures
  7. ✅ Test failure scenarios and validate recovery processes
  8. ✅ Document everything for knowledge transfer

Time Investment: ~30-40 hours (6 phases)
Current Time Spent: ~30 hours


๐Ÿ—๏ธ Infrastructure Architecture

Environment Specifications

  • Hypervisor: VirtualBox 7.x on Windows 11 host
  • Operating System: Linux Mint 22 (based on Ubuntu 24.04 LTS)
  • Network: Dual-adapter setup (NAT + Host-Only)
  • Automation Platform: Ansible 2.16+
  • Version Control: Git / GitHub

Server Topology

| Hostname | Role | NAT IP | Host-Only IP | vCPU | RAM | Disk | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| baseline-template | Golden Image | 10.0.2.10 | 192.168.56.10 | 2 | 2GB | 25GB | 🔴 Powered Off |
| control-node | Ansible Controller | 10.0.2.11 | 192.168.56.11 | 2 | 2GB | 25GB | 🟢 Running |
| web-server | Nginx Reverse Proxy | 10.0.2.12 | 192.168.56.12 | 2 | 2GB | 25GB | 🟢 Running |
| app-server | Node.js Application | 10.0.2.13 | 192.168.56.13 | 2 | 2GB | 25GB | 🟢 Running |
| db-server | PostgreSQL Database | 10.0.2.14 | 192.168.56.14 | 2 | 4GB | 50GB | 🟢 Running |

Network Architecture

┌───────────────────────────────────────────────────────────────────┐
│                     Windows 11 Host Machine                       │
│                  Your Workstation (SSH Client)                    │
│                   192.168.56.1 (Host-Only Gateway)                │
└───────────────────────────┬───────────────────────────────────────┘
                            │
                            │ SSH Access via Host-Only Network
                            │ (Management & Development)
                            │
        ┌───────────────────┼───────────────────┐
        │                   │                   │
        │                   │                   │
   ┌────▼─────┐      ┌──────▼──────┐     ┌─────▼──────┐
   │ control  │      │ web-server  │     │ app-server │
   │  -node   │◄────►│   (nginx)   │◄───►│  (node.js) │
   │          │      │             │     │            │
   │.56.11    │      │  .56.12     │     │  .56.13    │
   └────┬─────┘      └──────┬──────┘     └─────┬──────┘
        │                   │                   │
        │                   │                   │
        └───────────────────┼───────────────────┘
                            │
                            │
                      ┌─────▼──────┐
                      │ db-server  │
                      │(postgresql)│
                      │            │
                      │  .56.14    │
                      └────────────┘

╔══════════════════════════════════════════════════════════════════╗
║       NAT Network (InfraNet - 10.0.2.0/24)                       ║
║   VM-to-VM Communication & Internet Access                       ║
║                                                                  ║
║   control-node: 10.0.2.11    web-server:  10.0.2.12             ║
║   app-server:   10.0.2.13    db-server:   10.0.2.14             ║
╚══════════════════════════════════════════════════════════════════╝
                            │
                            │ Internet Access
                            ▼
                    ┌───────────────┐
                    │   Internet    │
                    │ (via Windows) │
                    └───────────────┘

Application Flow

Internet → [Web Server:80] → [App Server:3000] → [Database:5432]
           Nginx Proxy     Express.js API      PostgreSQL
            
Security: Each tier only accepts connections from the previous tier

📈 Project Phases

Phase 1: Manual Base Configuration ✅ COMPLETE

Objective: Build the infrastructure foundation manually to understand every component

Tasks Completed:

  • VirtualBox environment setup with NAT and Host-Only networks
  • Created baseline template VM with dual network adapters
  • Manual security hardening (SSH, firewall, fail2ban)
  • Installed monitoring agent (node_exporter)
  • Configured automatic security updates
  • Cloned and configured 4 production VMs
  • Established hostname resolution via /etc/hosts
  • Verified connectivity and services
  • Created snapshot: Phase1-Complete-Baseline

Time Invested: ~4 hours
Status: ✅ 100% Complete


Phase 2: Automation with Ansible ✅ COMPLETE

Objective: Convert all manual configurations into automated, repeatable Ansible playbooks

Tasks Completed:

Step 2.1: Ansible Control Node Setup ✅

  • Installed Ansible 2.16+ on control-node
  • Created complete project directory structure
  • Generated SSH keys for Ansible automation
  • Distributed SSH keys to all managed nodes
  • Configured passwordless sudo on all managed nodes
  • Created ansible.cfg with optimized settings
  • Created inventory file with logical host groups
  • Created group_vars/all.yml with global variables
  • Tested Ansible connectivity (ping module)
  • Initialized Git repository
  • Pushed to GitHub repository

Step 2.2: Base Security Hardening Automation ✅

  • Created 5 security roles:
    • ssh_hardening - SSH security configuration
    • firewall - UFW firewall rules
    • fail2ban - Intrusion prevention system
    • auto_updates - Unattended security patches
    • node_exporter - Prometheus metrics exporter
  • Created base-hardening.yml playbook
  • Successfully executed on all 3 managed nodes
  • Tested and verified idempotency
  • Created verify-config.yml verification playbook
  • All security services running and verified
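Idempotency can be checked mechanically: run an already-applied playbook a second time and confirm that the recap reports no changes. The sketch below assumes Ansible's default PLAY RECAP output format; the helper name `recap_is_clean` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: reads ansible-playbook output on stdin and succeeds
# only if no host reports a non-zero changed, unreachable, or failed counter.
recap_is_clean() {
  ! grep -Eq '(changed|unreachable|failed)=[1-9]'
}

# Usage (second run of a playbook that has already been applied):
#   ansible-playbook playbooks/base-hardening.yml | recap_is_clean \
#     && echo "idempotent: second run changed nothing"
```

Note that empty input also passes the check, so it is only meaningful when piped from a real playbook run.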

Step 2.3: Service-Specific Playbooks ✅

  • Web Server Deployment:

    • Created nginx role with reverse proxy configuration
    • Configured security headers
    • Created web-server.yml playbook
    • Deployed and verified Nginx
    • Opened firewall ports 80, 443
  • Application Server Deployment:

    • Created nodejs_app role
    • Deployed Express.js sample application
    • Configured systemd service (myapp.service)
    • Created app-server.yml playbook
    • Service running and health checks passing
    • Restricted access to web server only
  • Database Server Deployment:

    • Created postgresql role
    • Installed PostgreSQL 16
    • Created application database (appdb)
    • Created database user (appuser)
    • Configured network access from app server
    • Created db-server.yml playbook
    • Database accessible and verified

Step 2.4: End-to-End Testing ✅

  • Created comprehensive verification playbook
  • Tested Web → App connectivity
  • Tested App → Database connectivity
  • Verified full request flow (end-to-end)
  • All services passing health checks

Deliverables Completed:

  • 8 Ansible roles (reusable components)
  • 6 Ansible playbooks (automation scripts)
  • Complete 3-tier application stack
  • Full security hardening
  • Monitoring foundation

Time Invested: ~11 hours
Status: ✅ 100% Complete
Snapshot: Phase2-Complete-Full-Automation (ready to create)


Phase 3: Centralized Monitoring ✅ COMPLETE

Objective: Implement Prometheus and Grafana for infrastructure monitoring

Tasks Completed:

Step 3.1: Prometheus Deployment ✅

  • Created Prometheus role (roles/prometheus/)
  • Installed Prometheus 3.9.1 from GitHub releases
  • Created Prometheus system user and directories
  • Configured Prometheus to scrape all 4 node_exporters
  • Set up systemd service for Prometheus
  • Configured scraping targets:
    • control-node: 10.0.2.11:9100
    • web-server: 10.0.2.12:9100
    • app-server: 10.0.2.13:9100
    • db-server: 10.0.2.14:9100
  • Configured firewall to allow port 9090 from host-only network
  • Verified Prometheus is running and healthy
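The scrape targets listed above map onto a single static scrape job. The following is a minimal sketch of how such a prometheus.yml could be rendered, not the role's actual template: the function name and the job name "node" are assumptions, while the targets and the 15-second interval come from this README.

```shell
# Hypothetical generator for a minimal prometheus.yml.
# Each argument is one host:port scrape target.
render_prometheus_config() {
  cat <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets:
EOF
  for target in "$@"; do
    printf '          - "%s"\n' "$target"
  done
}

render_prometheus_config 10.0.2.11:9100 10.0.2.12:9100 10.0.2.13:9100 10.0.2.14:9100
```

If promtool is installed alongside Prometheus, `promtool check config <file>` can validate the rendered file before the service is restarted.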

Step 3.2: Grafana Deployment ✅

  • Created Grafana role (roles/grafana/)
  • Installed Grafana 12.3.2 from official repository
  • Configured Grafana to run on port 3001
  • Set default admin password: admin123!
  • Configured Prometheus as default data source
  • Configured firewall to allow port 3001 from host-only network
  • Created provisioning for automatic data source configuration
  • Verified Grafana is running and accessible

Step 3.3: Dashboard Implementation ✅

  • Created monitoring dashboards via Grafana API
  • Imported and tested dashboard templates
  • Created working dashboards with proven queries:
    • "System Monitoring Dashboard" (comprehensive metrics)
    • "SIMPLE TEST - RAW METRICS" (debug dashboard)
    • "GUARANTEED WORKING - TABLE VIEW" (table format)
    • "GUARANTEED WORKING - STAT VIEW" (stat panels)
  • Tested all metrics are being collected and displayed

Step 3.4: Playbook Development ✅

  • Created monitoring.yml playbook for stack deployment
  • Created verify-monitoring.yml for validation
  • Created open-monitoring-ports.yml for firewall configuration
  • Tested idempotency of all playbooks
  • Documented access URLs and credentials

Deliverables Completed:

  • ✅ Centralized monitoring with Prometheus + Grafana
  • ✅ 2 new Ansible roles (prometheus, grafana)
  • ✅ 3 new playbooks for monitoring stack
  • ✅ 4+ operational dashboards
  • ✅ Real-time metrics from all 4 servers
  • ✅ Documentation and access guide

Access URLs:

  • Prometheus: http://192.168.56.11:9090
  • Grafana: http://192.168.56.11:3001

Metrics Collected:

  • CPU usage and load averages
  • Memory utilization
  • Disk space and I/O
  • Network traffic
  • System uptime
  • Running processes

Time Invested: ~6 hours
Status: ✅ 100% Complete
Snapshot: Phase3-Complete-Monitoring-Stack (ready to create)


Phase 4: Centralized Logging ✅ COMPLETE

Objective: Implement centralized log management with rsyslog

Tasks Completed:

Step 4.1: Log Server Setup ✅

  • Created rsyslog_server role for control-node
  • Configured rsyslog to receive logs on port 514 (UDP/TCP)
  • Set up log file organization by hostname
  • Configured firewall to allow syslog traffic from internal network
  • Created log directory structure in /var/log/remote/

Step 4.2: Log Client Configuration ✅

  • Created rsyslog_client role for managed nodes
  • Configured all managed nodes to forward logs to control-node
  • Set up reliable log forwarding with queue management
  • Tested log forwarding from all 3 servers
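Reliable forwarding in rsyslog is usually done by attaching a disk-assisted queue to the forwarding action, so messages are buffered while the log server is unreachable. The sketch below renders such a client rule. It is an illustration rather than the rsyslog_client role's real template: the function name and queue file name are assumptions, and TCP is chosen here even though the server also listens on UDP.

```shell
# Hypothetical generator for a client-side forwarding rule.
# $1 = log server address, $2 = port
render_rsyslog_forwarder() {
  cat <<EOF
# Forward all messages over TCP, buffering them while the server is down.
*.* action(type="omfwd"
           target="$1" port="$2" protocol="tcp"
           queue.type="LinkedList"
           queue.filename="fwd_control_node"
           queue.saveOnShutdown="on"
           action.resumeRetryCount="-1")
EOF
}

render_rsyslog_forwarder 10.0.2.11 514
```

After the output is placed under /etc/rsyslog.d/, `sudo rsyslogd -N1` checks the configuration syntax before the service is restarted.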

Step 4.3: Log Management ✅

  • Implemented log rotation policies
  • Configured retention: daily logs, weekly archives
  • Set up automatic compression of old logs
  • Created logrotate configuration for remote logs
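A logrotate policy for the remote log tree could look roughly like the one rendered below. This is a sketch under assumptions: the function name is hypothetical, and the rotation count of 7 is a placeholder because the README does not state the exact number of rotations kept.

```shell
# Hypothetical generator for a logrotate policy covering all remote hosts.
render_remote_logrotate() {
  cat <<'EOF'
/var/log/remote/*/syslog {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
}
EOF
}

render_remote_logrotate
```

Once saved under /etc/logrotate.d/, the policy can be dry-run with `sudo logrotate -d <file>`, which prints what would be rotated without touching any logs.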

Step 4.4: Testing and Verification ✅

  • Created logging.yml playbook for deployment
  • Created verify-logging.yml for validation
  • Tested log forwarding from all managed nodes
  • Verified centralized log collection
  • Confirmed log rotation is working

Deliverables Completed:

  • ✅ Centralized log server on control-node
  • ✅ Log forwarding from all managed nodes
  • ✅ 2 new Ansible roles (rsyslog_server, rsyslog_client)
  • ✅ 2 new playbooks for logging infrastructure
  • ✅ Automated log rotation and retention
  • ✅ Organized log directory structure

Log Structure:

/var/log/remote/
├── web-server/
│   └── syslog
├── app-server/
│   └── syslog
└── db-server/
    └── syslog

Time Invested: ~2 hours
Status: ✅ 100% Complete


Phase 5: Backup Automation ✅ COMPLETE

Objective: Implement automated backup system for critical data

Tasks Completed:

Step 5.1: Backup Strategy Design ✅

  • Designed multi-tier backup retention strategy
  • Defined backup types: database and configuration
  • Established retention periods:
    • Daily: 7 days
    • Weekly: 28 days
    • Monthly: 90 days
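The retention periods above can be enforced with an age-based prune per tier. The sketch below is one way to do it, assuming archives end in .gz and live in daily/, weekly/, and monthly/ subdirectories; the function names are hypothetical and this is not the repository's backup script.

```shell
# Hypothetical helper: delete archives in one tier directory once their age,
# counted in whole days, exceeds the given limit.
prune_tier() {
  dir=$1
  days=$2
  find "$dir" -type f -name '*.gz' -mtime +"$days" -delete
}

# Apply the 7/28/90-day retention policy under one backup base directory.
prune_all() {
  base=$1
  prune_tier "$base/daily" 7
  prune_tier "$base/weekly" 28
  prune_tier "$base/monthly" 90
}

# Usage: prune_all /var/backups/database
```

Because find's `-mtime +N` discards fractional days, a file is removed only once it is at least N+1 full days old, which errs on the side of keeping backups slightly longer.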

Step 5.2: Database Backup Implementation ✅

  • Created backup_postgresql role
  • Implemented PostgreSQL backup script with pg_dump
  • Configured compression (gzip) for space efficiency
  • Set up automated retention management
  • Created cron job for daily execution (2:00 AM)
  • Deployed to db-server
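The core of a pg_dump-based backup is short. The sketch below shows the general shape under stated assumptions: the function names are hypothetical, the dump runs as the postgres system user, and the file name follows the appdb_YYYY-MM-DD.sql.gz pattern used in the restore commands later in this README. It is not the contents of backup-database.sh.

```shell
# Build the archive name. $1 = database name, $2 = date as YYYY-MM-DD.
backup_file_name() {
  printf '%s_%s.sql.gz\n' "$1" "$2"
}

# Dump one database into a dated, gzip-compressed file.
# $1 = database name, $2 = destination directory
run_database_backup() {
  db=$1
  dest_dir=$2
  target="$dest_dir/$(backup_file_name "$db" "$(date +%F)")"
  # Dump as the postgres system user and compress on the fly.
  # A failing pg_dump is masked by gzip's exit status in a plain pipeline,
  # so a real script should also verify that the archive is non-empty.
  sudo -u postgres pg_dump "$db" | gzip > "$target"
}

# Example crontab entry for the 2:00 AM schedule:
#   0 2 * * * /usr/local/bin/backup-database.sh
```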

Step 5.3: Configuration Backup Implementation ✅

  • Created backup_configs role
  • Implemented backup script for system configurations:
    • Nginx configurations
    • Application code
    • SSH configurations
    • Firewall rules
    • fail2ban settings
    • rsyslog configurations
    • Ansible infrastructure files
  • Configured compression and retention
  • Created cron job for daily execution (3:00 AM)
  • Deployed to web-server and app-server

Step 5.4: Backup Deployment and Testing ✅

  • Created backup.yml playbook
  • Fixed YAML syntax errors in backup roles
  • Deployed backup system to all servers
  • Created backup directories on all nodes
  • Verified backup scripts are executable
  • Tested manual backup execution
  • Confirmed cron jobs are scheduled
  • Validated backup files are created with content

Deliverables Completed:

  • ✅ Automated database backups (PostgreSQL on db-server)
  • ✅ Automated configuration backups (web-server, app-server)
  • ✅ 2 new Ansible roles (backup_postgresql, backup_configs)
  • ✅ 1 new playbook for backup deployment
  • ✅ Multi-tier retention policy (7/28/90 days)
  • ✅ Scheduled cron jobs for automation
  • ✅ Backup verification capability

Backup Architecture:

┌──────────────────────────────────────────────────────────┐
│                    Backup Strategy                       │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  db-server (10.0.2.14)                                   │
│  ├── /var/backups/database/                              │
│  │   ├── daily/    (7 days retention)                    │
│  │   ├── weekly/   (28 days retention)                   │
│  │   └── monthly/  (90 days retention)                   │
│  └── Cron: Daily at 2:00 AM                              │
│                                                          │
│  web-server (10.0.2.12)                                  │
│  ├── /var/backups/configs/                               │
│  │   ├── daily/    (7 days retention)                    │
│  │   ├── weekly/   (28 days retention)                   │
│  │   └── monthly/  (90 days retention)                   │
│  └── Cron: Daily at 3:00 AM                              │
│                                                          │
│  app-server (10.0.2.13)                                  │
│  ├── /var/backups/configs/                               │
│  │   ├── daily/    (7 days retention)                    │
│  │   ├── weekly/   (28 days retention)                   │
│  │   └── monthly/  (90 days retention)                   │
│  └── Cron: Daily at 3:00 AM                              │
│                                                          │
└──────────────────────────────────────────────────────────┘

Backup Components:

1. Database Backups (db-server):

  • Full PostgreSQL database dump (appdb)
  • Backup location: /var/backups/database/
  • Schedule: Daily at 2:00 AM
  • Manual execution: sudo /usr/local/bin/backup-database.sh

2. Configuration Backups (web-server, app-server):

  • System and application configurations
  • Backup location: /var/backups/configs/
  • Schedule: Daily at 3:00 AM
  • Manual execution: sudo /usr/local/bin/backup-configs.sh

Time Invested: ~2 hours
Status: ✅ 100% Complete


Phase 6: Disaster Recovery ✅ COMPLETE

Objective: Develop and test disaster recovery procedures

Tasks Completed:

Step 6.1: Database Restore Tools ✅

  • Created restore-database.sh script for PostgreSQL
  • Deployed to db-server (/usr/local/bin/)
  • Interactive confirmation and validation
  • Automatic backup decompression support
  • Created list-db-backups.sh utility

Step 6.2: Configuration Restore Tools ✅

  • Created restore-configs.sh script for system configs
  • Deployed to web-server and app-server
  • Restores Nginx, SSH, application configurations
  • Automatic service restart after restore
  • Created list-config-backups.sh utility

Step 6.3: Infrastructure Rebuild Automation ✅

  • Created rebuild-infrastructure.yml master playbook
  • Single command rebuilds entire infrastructure
  • Imports all deployment playbooks in sequence
  • Tested playbook syntax and structure

Step 6.4: Recovery Procedures Documented ✅

  • Defined Recovery Time Objectives (RTO):
    • Database: 2 hours
    • Web/App Servers: 30 minutes
    • Complete Infrastructure: 4 hours
  • Defined Recovery Point Objectives (RPO): 24 hours (daily backups)
  • Created disaster recovery procedures
  • Documented restore commands and workflows

Deliverables Completed:

  • ✅ Database restore script (restore-database.sh)
  • ✅ Configuration restore script (restore-configs.sh)
  • ✅ Backup listing utilities (list-db-backups.sh, list-config-backups.sh)
  • ✅ 2 new playbooks (disaster-recovery.yml, rebuild-infrastructure.yml)
  • ✅ RTO/RPO documentation
  • ✅ Recovery procedures documented

Recovery Commands:

# List available backups
ssh sysadmin@192.168.56.14 "sudo /usr/local/bin/list-db-backups.sh"
ssh sysadmin@192.168.56.12 "sudo /usr/local/bin/list-config-backups.sh"

# Restore database (DESTRUCTIVE)
ssh sysadmin@192.168.56.14
sudo /usr/local/bin/restore-database.sh /var/backups/database/daily/appdb_YYYY-MM-DD.sql.gz

# Restore configurations (DESTRUCTIVE)
ssh sysadmin@192.168.56.12
sudo /usr/local/bin/restore-configs.sh /var/backups/configs/daily/configs_YYYY-MM-DD.tar.gz

# Complete infrastructure rebuild
cd ~/infrastructure
ansible-playbook playbooks/rebuild-infrastructure.yml

Time Invested: ~3 hours
Status: ✅ 100% Complete
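Because a restore overwrites live data, the scripts gate it behind an interactive confirmation. The sketch below shows one way such a gate and a gzip-aware restore could be written. It is an illustration only, with hypothetical function names, and does not reproduce restore-database.sh.

```shell
# Ask the operator to type "yes" before a destructive restore.
# $1 = archive path, $2 = database name
confirm_restore() {
  printf 'Restore %s into %s? This overwrites current data. Type "yes": ' "$1" "$2" >&2
  read -r answer
  [ "$answer" = "yes" ]
}

# Decompress a .sql.gz dump and replay it into the target database.
restore_database() {
  archive=$1
  db=$2
  confirm_restore "$archive" "$db" || { echo "Aborted." >&2; return 1; }
  gunzip -c "$archive" | sudo -u postgres psql "$db"
}

# Usage: restore_database /var/backups/database/daily/appdb_YYYY-MM-DD.sql.gz appdb
```

Requiring the full word "yes" rather than a single keypress makes an accidental restore less likely.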

📊 Current Progress

Overall Project Status

Phase 1: ████████████████████████████████ 100% ✅ COMPLETE
Phase 2: ████████████████████████████████ 100% ✅ COMPLETE
Phase 3: ████████████████████████████████ 100% ✅ COMPLETE
Phase 4: ████████████████████████████████ 100% ✅ COMPLETE
Phase 5: ████████████████████████████████ 100% ✅ COMPLETE
Phase 6: ████████████████████████████████ 100% ✅ COMPLETE
═════════════════════════════════════════════════════════
Overall: ████████████████████████████████ 100% Complete

Phase 2 Achievement Summary

Infrastructure Automated:

  • ✅ 4 VMs fully configured via Ansible
  • ✅ 8 reusable Ansible roles created
  • ✅ 6 functional playbooks developed
  • ✅ Complete 3-tier application stack deployed
  • ✅ Zero manual configuration required
  • ✅ Full idempotency verified

Services Deployed:

  • ✅ Nginx reverse proxy (web-server)
  • ✅ Express.js application (app-server)
  • ✅ PostgreSQL 16 database (db-server)
  • ✅ Security hardening (all servers)
  • ✅ Monitoring agents (all servers)

Testing Results:

  • ✅ All playbooks execute successfully
  • ✅ Idempotency confirmed (safe to re-run)
  • ✅ End-to-end connectivity verified
  • ✅ All services healthy and responsive
  • ✅ Security measures active and tested

Phase 3 Achievement Summary

Monitoring Infrastructure Deployed:

  • ✅ Prometheus 3.9.1 installed and configured on control-node
  • ✅ Grafana 12.3.2 installed and configured on control-node
  • ✅ All 4 servers being monitored (100% coverage)
  • ✅ Real-time metrics collection every 15 seconds
  • ✅ Dashboard visualization with multiple views
  • ✅ Data source integration tested and working

New Ansible Components:

  • ✅ roles/prometheus/ - Complete Prometheus role
  • ✅ roles/grafana/ - Complete Grafana role
  • ✅ playbooks/monitoring.yml - Monitoring stack deployment
  • ✅ playbooks/verify-monitoring.yml - Monitoring validation
  • ✅ playbooks/open-monitoring-ports.yml - Firewall configuration

Dashboards Created:

  • ✅ "System Monitoring Dashboard" - Comprehensive metrics view
  • ✅ "SIMPLE TEST - RAW METRICS" - Debug/verification dashboard
  • ✅ "GUARANTEED WORKING - TABLE VIEW" - Tabular data display
  • ✅ "GUARANTEED WORKING - STAT VIEW" - Stat panel dashboard

Testing Results:

  • ✅ Prometheus scraping all 4 targets (all "UP")
  • ✅ Grafana can query Prometheus successfully
  • ✅ Dashboard panels showing real-time data
  • ✅ All services healthy and responsive
  • ✅ Firewall rules properly configured

Time Investment

  • Phase 1: 4 hours ✅
  • Phase 2: 11 hours ✅
  • Phase 3: 6 hours ✅
  • Phase 4: 2 hours ✅
  • Phase 5: 2 hours ✅
  • Phase 6: 3 hours ✅
  • Total: 28 hours
  • Estimated remaining: none (all six phases complete)

Last Updated

Date: February 2026
Current Phase: All 6 phases complete ✅


🔐 Security Implementations

Multi-Layered Security Approach

1. SSH Hardening ✅

  • ✅ Key-based authentication only (password authentication disabled)
  • ✅ Root login disabled via SSH
  • ✅ Public key authentication configured for sysadmin user
  • ✅ MaxAuthTries: Limited to 3 attempts
  • ✅ Automated via Ansible (ssh_hardening role)

Configuration File: /etc/ssh/sshd_config
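The hardening points above correspond to a handful of standard sshd_config directives. The sketch below prints them; the function name is hypothetical, and the role's real template may set additional options.

```shell
# Hypothetical generator for the sshd_config directives that match the
# hardening list: key-only login, no root login, three auth attempts.
render_sshd_hardening() {
  cat <<'EOF'
PasswordAuthentication no
PubkeyAuthentication yes
PermitRootLogin no
MaxAuthTries 3
EOF
}

render_sshd_hardening
```

After changing /etc/ssh/sshd_config, `sudo sshd -t` validates the syntax, which is worth running before reloading the service so a typo cannot lock out remote access.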

2. Firewall (UFW) ✅

  • ✅ Default Policy: Deny incoming, Allow outgoing
  • ✅ Service-Specific Rules:
    • SSH (22/tcp) - Management access
    • HTTP (80/tcp) - Web server only
    • HTTPS (443/tcp) - Web server only
    • App (3000/tcp) - From web server only
    • PostgreSQL (5432/tcp) - From app server only
    • node_exporter (9100/tcp) - Internal network only
    • Syslog (514/udp, 514/tcp) - Internal network only
  • ✅ Automated via Ansible (firewall role)

Check Status: sudo ufw status verbose
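The per-tier rules can be expressed as plain ufw commands. The sketch below is a dry-run generator that only prints the commands for a given role so they can be reviewed before being run with sudo. The role names and function name are assumptions; the addresses come from the server topology table, and the syslog rule for the control node is left out for brevity.

```shell
# Addresses taken from the server topology table in this README.
WEB_IP=10.0.2.12
APP_IP=10.0.2.13
INTERNAL_NET=10.0.2.0/24

# Hypothetical dry-run generator: print (do not execute) the ufw commands
# for one role: web, app, or db.
ufw_tier_rules() {
  role=$1
  echo "ufw default deny incoming"
  echo "ufw default allow outgoing"
  echo "ufw allow 22/tcp"
  echo "ufw allow from $INTERNAL_NET to any port 9100 proto tcp"
  case "$role" in
    web) echo "ufw allow 80/tcp"; echo "ufw allow 443/tcp" ;;
    app) echo "ufw allow from $WEB_IP to any port 3000 proto tcp" ;;
    db)  echo "ufw allow from $APP_IP to any port 5432 proto tcp" ;;
    *)   echo "unknown role: $role" >&2; return 1 ;;
  esac
}

ufw_tier_rules db
```

Restricting ports 3000 and 5432 by source address is what enforces the rule that each tier only accepts connections from the previous one.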

3. Intrusion Prevention (fail2ban) ✅

  • ✅ Monitoring: SSH login attempts
  • ✅ Max Retries: 3 failed attempts
  • ✅ Ban Time: 3600 seconds (1 hour)
  • ✅ Find Time: 600 seconds (10 minutes)
  • ✅ Automatic IP banning after threshold exceeded
  • ✅ Automated via Ansible (fail2ban role)

Configuration File: /etc/fail2ban/jail.local
Check Status: sudo fail2ban-client status sshd
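In jail.local terms, the thresholds above amount to a short sshd jail definition. The sketch below prints one; the function name is hypothetical and the role's actual file may contain further settings.

```shell
# Hypothetical generator for an sshd jail using the values listed above:
# 3 failures within 600 seconds lead to a 3600-second ban.
render_sshd_jail() {
  cat <<'EOF'
[sshd]
enabled  = true
maxretry = 3
findtime = 600
bantime  = 3600
EOF
}

render_sshd_jail
```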

4. Automatic Security Updates ✅

  • ✅ Service: unattended-upgrades
  • ✅ Update Type: Security updates only
  • ✅ Auto-reboot: Disabled (manual control)
  • ✅ Old Kernel Cleanup: Enabled
  • ✅ Daily Update Check: Automated
  • ✅ Automated via Ansible (auto_updates role)

Configuration File: /etc/apt/apt.conf.d/50unattended-upgrades

5. System Monitoring ✅

  • ✅ Agent: Prometheus node_exporter v1.8.2
  • ✅ Metrics Port: 9100
  • ✅ Metrics Collected:
    • CPU usage and load averages
    • Memory and swap utilization
    • Disk space and I/O
    • Network traffic and errors
    • System uptime and processes
  • ✅ Automated via Ansible (node_exporter role)

Access Metrics: curl http://localhost:9100/metrics

6. Network Segmentation ✅

  • ✅ Web Tier: Internet-facing (ports 80, 443)
  • ✅ App Tier: Only accessible from web server
  • ✅ Database Tier: Only accessible from app server
  • ✅ Management: SSH restricted via firewall rules

🛠️ Technologies Used

Operating Systems & Virtualization

  • Host OS: Windows 11 Pro
  • Hypervisor: Oracle VirtualBox 7.x
  • Guest OS: Linux Mint 22 Wilma (based on Ubuntu 24.04 LTS)
  • Kernel: Linux 6.8.x

Automation & Configuration Management

  • Ansible: 2.16+ (automation platform)
  • YAML: Configuration and playbook syntax
  • Jinja2: Template engine for dynamic configurations
  • Git: Version control
  • GitHub: Remote repository

Security Tools

  • OpenSSH: Secure remote access
  • UFW: Firewall management (frontend for iptables)
  • fail2ban: Intrusion prevention system
  • unattended-upgrades: Automatic security patching

Application Stack

  • Nginx: Reverse proxy and web server
  • Node.js: JavaScript runtime (v18.x)
  • Express.js: Web application framework
  • PostgreSQL: Relational database (v16)

Monitoring & Observability

  • Prometheus node_exporter: Metrics collection agent (v1.8.2)
  • Prometheus: Time-series database and alerting (v3.9.1)
  • Grafana: Visualization and dashboards (v12.3.2)

Logging & Backup

  • rsyslog: Centralized log management
  • logrotate: Log rotation and retention
  • pg_dump: PostgreSQL backup utility
  • cron: Job scheduling for automated backups
  • Bash: Backup and maintenance scripts

🚀 Quick Start

Prerequisites

  • Hardware: 16GB RAM minimum, 4+ CPU cores recommended
  • Software:
    • VirtualBox 7.x or later
    • Windows 10/11 (or any OS supporting VirtualBox)
    • SSH client (built into Windows 10+)
    • Git (for version control)

Access the Infrastructure

SSH from Windows (PowerShell)

# Access Ansible control node
ssh sysadmin@192.168.56.11

# Access web server
ssh sysadmin@192.168.56.12

# Access app server
ssh sysadmin@192.168.56.13

# Access database server
ssh sysadmin@192.168.56.14

Run Ansible Playbooks (from control-node)

# SSH into control node
ssh sysadmin@192.168.56.11

# Navigate to infrastructure directory
cd ~/infrastructure

# Test connectivity to all managed nodes
ansible managed_nodes -m ping

# Apply base security hardening to all managed nodes
ansible-playbook playbooks/base-hardening.yml

# Deploy web server (Nginx)
ansible-playbook playbooks/web-server.yml

# Deploy application server (Node.js)
ansible-playbook playbooks/app-server.yml

# Deploy database server (PostgreSQL)
ansible-playbook playbooks/db-server.yml

# Deploy monitoring stack (Prometheus + Grafana)
ansible-playbook playbooks/monitoring.yml

# Deploy centralized logging
ansible-playbook playbooks/logging.yml

# Deploy backup system
ansible-playbook playbooks/backup.yml

# Verify all configurations and services
ansible-playbook playbooks/verify-config.yml
ansible-playbook playbooks/verify-all-services.yml
ansible-playbook playbooks/verify-monitoring.yml
ansible-playbook playbooks/verify-logging.yml

# Run in check mode (dry run - no changes)
ansible-playbook playbooks/base-hardening.yml --check

# Run with verbose output for troubleshooting
ansible-playbook playbooks/base-hardening.yml -vvv

Access Monitoring Dashboards

# From Windows browser:
# Prometheus: http://192.168.56.11:9090
# Grafana:    http://192.168.56.11:3001
#
# Grafana Credentials:
#   Username: admin
#   Password: admin123!

Check Centralized Logs

# From control-node, view centralized logs
# Web server logs
sudo tail -f /var/log/remote/web-server/syslog

# App server logs
sudo tail -f /var/log/remote/app-server/syslog

# Database server logs
sudo tail -f /var/log/remote/db-server/syslog

# View all logs
sudo ls -lh /var/log/remote/*/

Verify Backups

# Check backup directories exist
ansible all -m shell -a "ls -lh /var/backups/" -b

# Check database backup files
ansible db_servers -m shell -a "ls -lh /var/backups/database/daily/" -b

# Check configuration backup files
ansible managed_nodes -m shell -a "ls -lh /var/backups/configs/daily/" -b

# Verify cron jobs are scheduled
ansible all -m shell -a "crontab -l 2>/dev/null | grep backup || echo 'No backup cron jobs'" -b

# Manually trigger a test backup
# Database backup
ansible db_servers -m shell -a "/usr/local/bin/backup-database.sh" -b

# Configuration backup
ansible web_servers -m shell -a "/usr/local/bin/backup-configs.sh" -b

Test the Application Stack

# From Windows, test the web server
curl http://192.168.56.12

# From control-node, test app server health
ansible app_servers -m shell -a "curl -s http://localhost:3000/health"

# Test database connectivity
ansible app_servers -m shell -a 'PGPASSWORD=SecurePassword123\! psql -h 10.0.2.14 -U appuser -d appdb -c "SELECT version();"' -b

# Test end-to-end flow (Web → App → Database)
curl http://192.168.56.12/health

Project Structure

infrastructure/
│
├── ansible.cfg                      # Ansible configuration file
├── .gitignore                       # Git ignore rules (secrets, keys)
├── README.md                        # This comprehensive documentation
│
├── inventory/
│   └── hosts.yml                    # Server inventory with host groups
│
├── group_vars/
│   └── all.yml                      # Global variables for all hosts
│
├── playbooks/                       # Ansible playbooks (automation scripts)
│   ├── base-hardening.yml           # Security hardening for all servers
│   ├── web-server.yml               # Nginx reverse proxy deployment
│   ├── app-server.yml               # Node.js application deployment
│   ├── db-server.yml                # PostgreSQL database deployment
│   ├── verify-config.yml            # Individual service verification
│   ├── verify-all-services.yml      # End-to-end testing
│   ├── monitoring.yml               # Monitoring stack deployment
│   ├── verify-monitoring.yml        # Monitoring validation
│   ├── open-monitoring-ports.yml    # Firewall for monitoring
│   ├── logging.yml                  # 📋 Centralized logging deployment
│   ├── verify-logging.yml           # 📋 Logging validation
│   └── backup.yml                   # 💾 Backup system deployment
│
├── roles/                           # Ansible roles (reusable components)
│   ├── ssh_hardening/               # SSH security configuration
│   ├── firewall/                    # UFW firewall configuration
│   ├── fail2ban/                    # Intrusion prevention
│   ├── auto_updates/                # Automatic security updates
│   ├── node_exporter/               # Monitoring agent
│   ├── nginx/                       # Web server and reverse proxy
│   ├── nodejs_app/                  # Node.js application
│   ├── postgresql/                  # PostgreSQL database
│   ├── prometheus/                  # Prometheus monitoring
│   ├── grafana/                     # Grafana visualization
│   ├── rsyslog_server/              # 📋 Centralized log server
│   ├── rsyslog_client/              # 📋 Log forwarding client
│   ├── backup_postgresql/           # 💾 Database backup automation
│   └── backup_configs/              # 💾 Configuration backup automation
│
└── files/                           # Static files (future use)
    └── scripts/

📚 Command Reference

Phase 1: Manual Base Configuration Commands

Initial Network Configuration (baseline-template)

# Set static IP on NAT Network (enp0s3)
sudo nmcli connection modify "Wired connection 1" \
  ipv4.addresses 10.0.2.10/24 \
  ipv4.gateway 10.0.2.1 \
  ipv4.dns "8.8.8.8,8.8.4.4" \
  ipv4.method manual

# Apply changes
sudo nmcli connection down "Wired connection 1"
sudo nmcli connection up "Wired connection 1"

# Set static IP on Host-Only Network (enp0s8)
sudo nmcli connection add type ethernet ifname enp0s8 con-name host-only \
  ipv4.addresses 192.168.56.10/24 \
  ipv4.method manual

# Activate Host-Only connection
sudo nmcli connection up host-only

# Verify network configuration
ip addr show
ip route show
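
Rather than reading the `ip addr show` output by eye, the expected address can be asserted. A minimal sketch, assuming the one-line-per-address format printed by `ip -o -4 addr show`; the helper name `has_addr` is hypothetical.

```shell
# Succeeds if the `ip -o -4 addr show` output on stdin lists CIDR on IFACE.
# Fields in that format: index, interface name, "inet", address/prefix, ...
has_addr() {
  awk -v iface="$1" -v cidr="$2" '
    $2 == iface && $3 == "inet" && $4 == cidr { found = 1 }
    END { exit !found }'
}
```

Usage on the template VM: `ip -o -4 addr show | has_addr enp0s3 10.0.2.10/24 && echo "enp0s3 OK"`.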

Hostname Configuration

# Set hostname
sudo hostnamectl set-hostname baseline-template

# Edit /etc/hosts for proper resolution
sudo nano /etc/hosts
# Change 127.0.1.1 line to: 127.0.1.1  baseline-template

# Verify
hostname
hostnamectl

System Updates

# Update package lists
sudo apt update

# Upgrade all packages
sudo apt upgrade -y

# Install essential tools
sudo apt install -y vim curl wget git net-tools ufw fail2ban openssh-server

# Reboot to apply updates
sudo reboot

SSH Configuration

# Install OpenSSH server (usually pre-installed)
sudo apt install -y openssh-server

# Enable and start SSH service
sudo systemctl enable ssh
sudo systemctl start ssh

# Create SSH directory and set permissions
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Add your public key to authorized_keys (paste your key)
nano ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Backup original SSH configuration
sudo cp /etc/ssh/sshd_config /etc/ssh/sshd_config.backup

# Edit SSH configuration
sudo nano /etc/ssh/sshd_config
# Set these values:
#   PermitRootLogin no
#   PasswordAuthentication no
#   PubkeyAuthentication yes
#   AuthorizedKeysFile .ssh/authorized_keys
#   MaxAuthTries 3

# Test configuration syntax
sudo sshd -t

# Restart SSH service
sudo systemctl restart ssh

# Verify SSH status
sudo systemctl status ssh
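
After the restart, the effective settings can be compared against the hardened values listed above. This sketch reads the `key value` lines that `sshd -T` prints (keys in lowercase); the function name `audit_sshd` is hypothetical and the check covers only four of the settings.

```shell
# Compare effective sshd settings on stdin to the hardened values.
# Prints one OK/FAIL line per expected key and returns non-zero on any mismatch.
audit_sshd() {
  awk '
    BEGIN {
      want["permitrootlogin"]        = "no"
      want["passwordauthentication"] = "no"
      want["pubkeyauthentication"]   = "yes"
      want["maxauthtries"]           = "3"
    }
    ($1 in want) { got[$1] = $2 }
    END {
      bad = 0
      for (k in want) {
        if (got[k] == want[k]) printf "OK   %s %s\n", k, got[k]
        else { printf "FAIL %s want=%s got=%s\n", k, want[k], got[k]; bad = 1 }
      }
      exit bad
    }'
}
```

Usage: `sudo sshd -T | audit_sshd`. A key that is missing from the input is reported as FAIL with an empty `got` value.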

Firewall (UFW) Configuration

# Set default policies
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH
sudo ufw allow 22/tcp

# Enable firewall
sudo ufw enable

# Check status
sudo ufw status verbose
sudo ufw status numbered

fail2ban Installation and Configuration

# Install fail2ban
sudo apt install -y fail2ban

# Copy default configuration
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local

# Edit configuration
sudo nano /etc/fail2ban/jail.local
# Configure [sshd] section:
#   enabled = true
#   port = 22
#   maxretry = 3
#   bantime = 3600
#   findtime = 600

# Enable and start fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban

# Check status
sudo systemctl status fail2ban
sudo fail2ban-client status
sudo fail2ban-client status sshd
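
As an alternative to editing a full copy of `jail.conf`, the same `[sshd]` values can be kept in a small override file. A sketch under assumptions: the function name and the target file name `sshd.local` are hypothetical, and it relies on fail2ban reading `*.local` files from `/etc/fail2ban/jail.d/`.

```shell
# Print an [sshd] jail override containing the values listed above.
render_sshd_jail() {
  cat <<'EOF'
[sshd]
enabled  = true
port     = 22
maxretry = 3
bantime  = 3600
findtime = 600
EOF
}
```

Usage: `render_sshd_jail | sudo tee /etc/fail2ban/jail.d/sshd.local >/dev/null`, followed by `sudo systemctl restart fail2ban`.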

Automatic Security Updates

# Install unattended-upgrades
sudo apt install -y unattended-upgrades apt-listchanges

# Configure unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades

# Edit configuration file
sudo nano /etc/apt/apt.conf.d/50unattended-upgrades
# Ensure security updates are enabled:
#   "${distro_id}:${distro_codename}-security";

# Enable and start service
sudo systemctl enable unattended-upgrades
sudo systemctl start unattended-upgrades

# Check status
sudo systemctl status unattended-upgrades

node_exporter Installation

# Download node_exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz

# Extract
tar xvfz node_exporter-1.8.2.linux-amd64.tar.gz

# Move binary to system path
sudo mv node_exporter-1.8.2.linux-amd64/node_exporter /usr/local/bin/

# Create system user
sudo useradd --no-create-home --shell /bin/false node_exporter

# Set ownership
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

# Create systemd service
sudo nano /etc/systemd/system/node_exporter.service
# Paste service configuration

# Reload systemd, enable and start service
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter

# Verify status
sudo systemctl status node_exporter

# Allow node_exporter through firewall (internal network only)
sudo ufw allow from 10.0.2.0/24 to any port 9100 proto tcp

# Test metrics endpoint
curl http://localhost:9100/metrics | head -20
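
The service configuration pasted into `node_exporter.service` is not shown in the steps above. The following is a minimal sketch of what such a unit could contain, assuming the binary path and the `node_exporter` user created earlier; it is not the project's actual file, and the function name is hypothetical.

```shell
# Print a minimal systemd unit for node_exporter (assumed contents).
render_node_exporter_unit() {
  cat <<'EOF'
[Unit]
Description=Prometheus Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

Usage: `render_node_exporter_unit | sudo tee /etc/systemd/system/node_exporter.service >/dev/null`, then run `sudo systemctl daemon-reload` before enabling the service.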

Phase 2: Ansible Automation Commands

Initial Setup on Control-Node

# SSH into control-node from Windows
ssh sysadmin@192.168.56.11

# Update system
sudo apt update && sudo apt upgrade -y

# Install Ansible
sudo apt install -y ansible

# Verify installation
ansible --version

Project Structure Creation

# Create main project directory
mkdir -p ~/infrastructure
cd ~/infrastructure

# Create subdirectories
mkdir -p inventory playbooks roles group_vars host_vars files templates

# Create initial files
touch ansible.cfg
touch inventory/hosts.yml
touch group_vars/all.yml

SSH Key Setup for Ansible

# Generate SSH key pair for Ansible (on control-node)
ssh-keygen -t ed25519 -C "ansible-control"
# Press ENTER for all prompts

# View the public key
cat ~/.ssh/id_ed25519.pub

# Copy SSH key to each managed node
ssh-copy-id sysadmin@web-server
ssh-copy-id sysadmin@app-server
ssh-copy-id sysadmin@db-server

# Test passwordless SSH
ssh sysadmin@web-server "hostname"
ssh sysadmin@app-server "hostname"
ssh sysadmin@db-server "hostname"

Test Ansible Connectivity

# Ping all managed nodes
ansible managed_nodes -m ping

# Check hostname
ansible managed_nodes -m command -a "hostname"

# Check uptime
ansible managed_nodes -m command -a "uptime"

# Test privilege escalation (expect "root"); -b uses Ansible's become instead of calling sudo inside the command
ansible managed_nodes -m command -a "whoami" -b

Deploy Service-Specific Playbooks

# Deploy web server (Nginx)
ansible-playbook playbooks/web-server.yml

# Deploy application server (Node.js)
ansible-playbook playbooks/app-server.yml

# Deploy database server (PostgreSQL)
ansible-playbook playbooks/db-server.yml

# Verify all services
ansible-playbook playbooks/verify-all-services.yml
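
The four commands above can be chained so that a failed deployment stops the run before verification. A minimal sketch: the function name `deploy_all` and the `RUNNER` override are hypothetical and exist only to allow a dry run.

```shell
# Run the service playbooks in the order shown above; stop at the first failure.
# RUNNER is a hypothetical override hook (for example RUNNER=echo for a dry run).
deploy_all() {
  runner="${RUNNER:-ansible-playbook}"
  for pb in web-server app-server db-server verify-all-services; do
    "$runner" "playbooks/$pb.yml" || {
      echo "stopped: playbooks/$pb.yml failed" >&2
      return 1
    }
  done
}
```

Run `RUNNER=echo deploy_all` from `~/infrastructure` to list the playbooks without executing them, or `deploy_all` to deploy.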

Phase 3: Monitoring Implementation Commands

Deploy Monitoring Stack

# Deploy Prometheus and Grafana
ansible-playbook playbooks/monitoring.yml

# Verify the monitoring stack
ansible-playbook playbooks/verify-monitoring.yml

Check Prometheus Status

# Check Prometheus service
sudo systemctl status prometheus

# Check Prometheus targets
curl http://localhost:9090/api/v1/targets | python3 -m json.tool

# Test Prometheus queries
curl "http://localhost:9090/api/v1/query?query=up" | python3 -m json.tool
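
The targets response is long, so a quick count of healthy targets can be easier to read than the pretty-printed JSON. This is a crude text-matching sketch, assuming the compact JSON that the targets endpoint returns; the helper name `count_targets` is hypothetical, and a JSON-aware tool would be more robust.

```shell
# Count targets in a given health state ("up", "down", "unknown")
# from the /api/v1/targets JSON on stdin.
count_targets() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}
```

Usage: `curl -s http://localhost:9090/api/v1/targets | count_targets up`. The result should match the number of configured scrape targets.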

Check Grafana Status

# Check Grafana service
sudo systemctl status grafana-server

# Test Grafana API
curl http://localhost:3001/api/health | python3 -m json.tool

Phase 4: Centralized Logging Commands

Deploy Logging Infrastructure

# SSH into control-node
ssh sysadmin@192.168.56.11
cd ~/infrastructure

# Deploy centralized logging
ansible-playbook playbooks/logging.yml

# Verify logging setup
ansible-playbook playbooks/verify-logging.yml

Check rsyslog Server (on control-node)

# Check rsyslog service
sudo systemctl status rsyslog

# View rsyslog configuration
sudo cat /etc/rsyslog.d/50-remote.conf

# Check if rsyslog is listening on port 514
sudo netstat -tulpn | grep rsyslog
sudo ss -tulpn | grep 514

# View centralized logs
sudo ls -lh /var/log/remote/

# View logs from specific server
sudo tail -f /var/log/remote/web-server/syslog
sudo tail -f /var/log/remote/app-server/syslog
sudo tail -f /var/log/remote/db-server/syslog

Check rsyslog Client (on managed nodes)

# Check rsyslog service on all nodes
ansible managed_nodes -m systemd -a "name=rsyslog state=started enabled=yes" -b

# View rsyslog client configuration
ansible managed_nodes -m shell -a "cat /etc/rsyslog.d/50-forward.conf" -b

# Test log forwarding
ansible web_servers -m shell -a "logger 'Test log message from web-server'" -b

# Verify test message appeared on control-node
sudo grep "Test log message" /var/log/remote/web-server/syslog
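
A fixed test string can match a message left over from an earlier run. One way around that is a unique marker per test, sketched here with hypothetical helper names.

```shell
# Build a marker that is unlikely to collide with earlier test messages
# (epoch seconds plus the current shell's PID).
make_marker() {
  printf 'fwd-test-%s-%s\n' "$(date +%s)" "$$"
}

# Succeeds if MARKER (second argument) appears in LOGFILE (first argument).
marker_arrived() {
  grep -qF -- "$2" "$1"
}

# Example flow on control-node (left as comments because it needs the lab hosts):
#   marker=$(make_marker)
#   ansible web_servers -m command -a "logger $marker" -b
#   sudo grep -F "$marker" /var/log/remote/web-server/syslog
```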

Troubleshoot Logging Issues

# Check firewall allows syslog
ansible control -m shell -a "ufw status | grep 514" -b

# Restart rsyslog on all nodes
ansible all -m systemd -a "name=rsyslog state=restarted" -b

# Check rsyslog errors
ansible all -m shell -a "journalctl -u rsyslog -n 20" -b

# Test connectivity from client to server
ansible managed_nodes -m shell -a "nc -zv 10.0.2.11 514" -b

Phase 5: Backup Automation Commands

Deploy Backup System

# SSH into control-node
ssh sysadmin@192.168.56.11
cd ~/infrastructure

# Deploy backup automation
ansible-playbook playbooks/backup.yml

Verify Backup Configuration

# 1. Verify backup directories exist
ansible all -m shell -a "ls -lh /var/backups/" -b

# 2. Check actual backup files were created
ansible db_servers -m shell -a "ls -lh /var/backups/database/daily/" -b
ansible managed_nodes -m shell -a "ls -lh /var/backups/configs/daily/" -b

# 3. Verify cron jobs are scheduled
ansible all -m shell -a "crontab -l 2>/dev/null | grep backup || echo 'No backup cron jobs'" -b

# 4. Check backup file sizes to confirm they have content
ansible db_servers -m shell -a "du -sh /var/backups/database/daily/* 2>/dev/null | head -3" -b
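
Listing files shows that backups exist but not that the latest one is recent. A minimal freshness check, assuming a daily schedule; the function name and the 1500-minute default (about 25 hours) are hypothetical choices, and `-mmin` requires GNU or BSD `find`.

```shell
# Succeeds if DIR holds at least one non-empty regular file modified within
# the last MAX_MIN minutes (default 1500, roughly 25 hours).
has_recent_backup() {
  dir="$1"
  max_min="${2:-1500}"
  [ -d "$dir" ] || return 1
  [ -n "$(find "$dir" -type f -mmin "-$max_min" -size +0 2>/dev/null | head -n 1)" ]
}
```

On db-server, for example: `sudo sh -c '. ./check.sh; has_recent_backup /var/backups/database/daily && echo fresh'`, where `check.sh` is a hypothetical file holding the function.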

Manual Backup Execution

# Manually trigger database backup
ansible db_servers -m shell -a "/usr/local/bin/backup-database.sh" -b

# Manually trigger configuration backup
ansible web_servers -m shell -a "/usr/local/bin/backup-configs.sh" -b
ansible app_servers -m shell -a "/usr/local/bin/backup-configs.sh" -b

# View backup logs
ansible db_servers -m shell -a "tail -20 /var/backups/logs/backup.log" -b
ansible web_servers -m shell -a "tail -20 /var/backups/logs/backup.log" -b

Check Backup Status

# Database backups on db-server
ssh sysadmin@192.168.56.14
sudo ls -lh /var/backups/database/daily/
sudo ls -lh /var/backups/database/weekly/
sudo ls -lh /var/backups/database/monthly/

# Configuration backups on web-server
ssh sysadmin@192.168.56.12
sudo ls -lh /var/backups/configs/daily/
sudo ls -lh /var/backups/configs/weekly/
sudo ls -lh /var/backups/configs/monthly/

# Check cron schedule
crontab -l | grep backup

Test Backup Restoration (for future Phase 6)

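# To restore a database backup (example for Phase 6).
# Assumes the .sql.gz file is a gzip-compressed plain-SQL pg_dump; pg_restore
# only reads custom, directory, or tar archives, so the dump is piped into psql:
# gunzip -c /var/backups/database/daily/backup-YYYYMMDD.sql.gz | sudo -u postgres psql -d appdb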

# To restore configuration files (example for Phase 6):
# sudo tar -xzf /var/backups/configs/daily/backup-YYYYMMDD.tar.gz -C /
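
Before restoring, it can help to confirm that the archive is intact. This sketch checks only the gzip layer (it applies to both `.sql.gz` and `.tar.gz` files) and says nothing about whether the contents are a valid dump; the function name is hypothetical.

```shell
# Succeeds if FILE exists, is non-empty, and passes gzip's integrity test.
verify_gzip_backup() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

Usage: `verify_gzip_backup /var/backups/database/daily/backup-YYYYMMDD.sql.gz || echo "archive damaged or missing"`.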

Common Troubleshooting Commands

Network Issues

# Check IP addresses
ip addr show

# Check routing table
ip route show

# Test connectivity
ping -c 4 8.8.8.8
ping -c 4 google.com
ping -c 4 web-server

# Check open ports
sudo netstat -tulpn
sudo ss -tulpn

# Test specific port
nc -zv hostname port

SSH Issues

# Check SSH service status
sudo systemctl status ssh

# View SSH logs
sudo journalctl -u ssh -n 50
sudo tail -f /var/log/auth.log

# Test SSH configuration
sudo sshd -t

# Debug SSH connection
ssh -vvv sysadmin@hostname

Firewall Issues

# Check UFW status
sudo ufw status verbose
sudo ufw status numbered

# View UFW logs
sudo tail -f /var/log/ufw.log

Service Management

# Check service status
sudo systemctl status SERVICE_NAME

# Start/Stop/Restart service
sudo systemctl start SERVICE_NAME
sudo systemctl stop SERVICE_NAME
sudo systemctl restart SERVICE_NAME

# Enable/Disable on boot
sudo systemctl enable SERVICE_NAME
sudo systemctl disable SERVICE_NAME

# View service logs
sudo journalctl -u SERVICE_NAME -n 50
sudo journalctl -u SERVICE_NAME -f

🎓 Skills Demonstrated

This project showcases a comprehensive set of skills valued in DevOps, Cloud Engineering, and System Administration roles:

Technical Skills

Linux System Administration:

  • Server installation and configuration
  • Network configuration and troubleshooting
  • User and permission management
  • Service management with systemd
  • Package management (apt)
  • Log analysis and troubleshooting

Security & Hardening:

  • SSH key-based authentication
  • Firewall configuration (UFW/iptables)
  • Intrusion detection and prevention
  • Security patch management
  • Principle of least privilege
  • Network segmentation
  • Security auditing

Infrastructure as Code (IaC):

  • Ansible playbook development
  • Role-based architecture
  • Idempotent configuration
  • Template management (Jinja2)
  • Variable management
  • Inventory organization
  • Multi-tier application deployment

Automation & Scripting:

  • Bash scripting
  • Ansible automation
  • Configuration management
  • Automated deployment
  • Service orchestration
  • Cron job scheduling

Application Deployment:

  • Reverse proxy configuration (Nginx)
  • Application server setup (Node.js/Express)
  • Database deployment (PostgreSQL)
  • Service integration
  • Health check implementation

Monitoring & Observability:

  • Metrics collection (node_exporter)
  • Service monitoring (Prometheus)
  • Performance tracking and visualization (Grafana)
  • Dashboard creation and customization
  • Time-series data analysis
  • Infrastructure observability
  • Real-time monitoring implementation

Logging & Auditing:

  • Centralized log management (rsyslog)
  • Log forwarding configuration
  • Log rotation and retention policies
  • Log analysis and troubleshooting

Backup & Recovery:

  • Automated backup strategies
  • Database backup automation (pg_dump)
  • Configuration backup procedures
  • Multi-tier retention policies
  • Backup verification and testing
  • Disaster recovery planning

Version Control:

  • Git workflow
  • Repository management
  • Commit best practices
  • Documentation maintenance
  • Change tracking

Networking:

  • TCP/IP fundamentals
  • DNS configuration
  • Firewall rules
  • Port management
  • Network troubleshooting
  • Multi-network setup (NAT + Host-Only)

Soft Skills

Documentation:

  • Clear, comprehensive README
  • Command references
  • Architecture diagrams
  • Troubleshooting guides
  • Progressive documentation

Problem Solving:

  • Systematic debugging
  • Root cause analysis
  • Solution documentation
  • Iterative improvement

Project Management:

  • Phase-based approach
  • Progress tracking
  • Time estimation
  • Milestone achievement
  • Scope management

Professional Development:

  • Self-directed learning
  • Following best practices
  • Continuous improvement
  • Knowledge sharing

🔮 Future Enhancements

Potential additions to expand the project:

Immediate Next Steps (Phase 6)

  • Database restore procedures
  • Configuration restore testing
  • Full infrastructure rebuild automation
  • RTO/RPO documentation
  • Disaster recovery playbooks

Infrastructure Improvements

  • High Availability (HA) setup with keepalived
  • Load balancing with HAProxy or Nginx
  • Containerization with Docker
  • Container orchestration with Kubernetes (K3s)
  • Service mesh implementation (Istio/Linkerd)

Security Enhancements

  • VPN setup (OpenVPN/WireGuard)
  • Certificate management with Let's Encrypt
  • Web Application Firewall (ModSecurity)
  • Security scanning (Lynis, OpenVAS)
  • Compliance automation (CIS benchmarks)
  • Vulnerability scanning

Monitoring & Alerting

  • APM (Application Performance Monitoring)
  • Distributed tracing (Jaeger)
  • Custom metrics and exporters
  • PagerDuty/Slack integration
  • SLA monitoring
  • Log aggregation (ELK stack)

CI/CD Pipeline

  • Jenkins/GitLab CI setup
  • Automated testing
  • Blue-green deployments
  • Canary releases
  • Rollback procedures
  • Pipeline automation

Cloud Migration

  • Terraform for AWS/Azure/GCP
  • Cloud-native monitoring
  • Auto-scaling groups
  • Managed database services
  • Cloud cost optimization

🤝 Contributing

This is a personal portfolio project, but feedback and suggestions are always welcome!

How to Provide Feedback

  1. Issues: Open an issue on GitHub for bugs or suggestions
  2. Discussions: Start a discussion for questions or ideas
  3. Pull Requests: Not accepted, since this is a learning project, but the interest is appreciated!

Learning Resources

If you're building a similar project, here are helpful resources:

Ansible:

Linux:

DevOps:

Prometheus & Grafana:

Logging & Backup:


📄 License

This project is open source and available under the MIT License.

Feel free to use this project as a template or reference for your own learning!


👤 Author

Skander Ba


📞 Contact & Feedback

Questions About This Project?

  • GitHub Issues: For bugs or technical questions
  • GitHub Discussions: For general questions and ideas
  • Email: For collaboration or opportunities

Acknowledgments

Special thanks to:

  • The Ansible community for excellent documentation
  • The Linux community for amazing tools and support
  • The Prometheus and Grafana teams for outstanding monitoring tools
  • The rsyslog and PostgreSQL communities for robust logging and database solutions
  • Everyone who provides feedback and suggestions

📈 Project Stats

  • Started: January 30, 2026
  • Phase 1 Complete: January 31, 2026
  • Phase 2 Complete: February 3, 2026
  • Phase 3 Complete: February 3, 2026
  • Phase 4 Complete: February 7, 2026
  • Phase 5 Complete: February 9, 2026
  • Phase 6 Complete: February 12, 2026
  • Current Status: Phase 6 Complete - Project completed
  • Total Commits: Check GitHub for latest count
  • Lines of Ansible Code: ~2,000+
  • Documentation Pages: 1 (comprehensive README)
  • Services Deployed: 3-tier application stack + Monitoring + Logging + Backups
  • Ansible Roles: 14 (security + services + monitoring + logging + backup)
  • Playbooks: 12 (deployment + verification)

🏆 Milestones Achieved

  • ✅ 2026-01-30: Project initiated, Phase 1 planning complete
  • ✅ 2026-01-31: Phase 1 complete - All VMs configured manually
  • ✅ 2026-02-02: Ansible installed, SSH keys distributed, passwordless sudo configured
  • ✅ 2026-02-02: Git repository initialized and pushed to GitHub
  • ✅ 2026-02-02: Created all base security hardening roles
  • ✅ 2026-02-02: Base-hardening playbook tested and verified
  • ✅ 2026-02-03: Web server (Nginx) deployed successfully
  • ✅ 2026-02-03: Application server (Node.js) deployed successfully
  • ✅ 2026-02-03: Database server (PostgreSQL) deployed successfully
  • ✅ 2026-02-03: Phase 2 COMPLETE - Full automation verified
  • ✅ 2026-02-03: Prometheus deployed and configured
  • ✅ 2026-02-03: Grafana deployed and configured
  • ✅ 2026-02-03: Monitoring dashboards created and tested
  • ✅ 2026-02-03: Phase 3 COMPLETE - Centralized monitoring operational
  • ✅ 2026-02-07: rsyslog server configured on control-node
  • ✅ 2026-02-07: Log forwarding from all managed nodes working
  • ✅ 2026-02-07: Phase 4 COMPLETE - Centralized logging operational
  • ✅ 2026-02-07: Database backup automation deployed
  • ✅ 2026-02-07: Configuration backup automation deployed
  • ✅ 2026-02-07: Multi-tier retention policy implemented
  • ✅ 2026-02-07: Phase 5 COMPLETE - Automated backups operational
  • ✅ 2026-02-12: Phase 6 COMPLETE - Completed Project

📊 Infrastructure Health Status

Last Verified: February 7, 2026

| Component | Status | Health Check |
|---|---|---|
| Web Server (Nginx) | 🟢 Running | ✓ HTTP responding |
| App Server (Node.js) | 🟢 Running | ✓ Health endpoint OK |
| Database (PostgreSQL) | 🟢 Running | ✓ Accepting connections |
| SSH Security | 🟢 Active | ✓ Key-only auth |
| Firewall (UFW) | 🟢 Active | ✓ Rules enforced |
| fail2ban | 🟢 Active | ✓ Monitoring SSH |
| Auto Updates | 🟢 Configured | ✓ Security patches enabled |
| Node Exporter | 🟢 Running | ✓ Metrics available (all 4 servers) |
| Prometheus | 🟢 Running | ✓ All 4 targets UP ✅ |
| Grafana | 🟢 Running | ✓ Dashboards operational ✅ |
| Centralized Logging | 🟢 Running | ✓ All nodes forwarding logs ✅ |
| Database Backups | 🟢 Scheduled | ✓ Cron job active (2:00 AM) ✅ |
| Config Backups | 🟢 Scheduled | ✓ Cron jobs active (3:00 AM) ✅ |
| End-to-End Connectivity | 🟢 Verified | ✓ Web→App→DB working |
| Full Stack | 🟢 Verified | ✓ Complete observability & backup ✅ |

Last Updated: February 12, 2026
README Version: 5.0
Status: Living Document - Updated as project progresses

If you found this project helpful or interesting, please consider giving it a ⭐ on GitHub!
