Enterprise-grade DevOps infrastructure for the Hankers social network
Live Platform: hankers.tech

Hankers DevOps is a production-ready infrastructure platform built with Infrastructure as Code (IaC) principles, designed for scalability, reliability, and automation. The project implements modern DevOps practices including container orchestration, CI/CD pipelines, monitoring, and cost-optimized cloud infrastructure.
This codebase demonstrates real-world DevOps engineering: automated deployments, infrastructure versioning, zero-downtime releases, and comprehensive observability.
- CI/CD Automation: GitHub Actions for build, test, and deploy
- Kubernetes Orchestration: Azure Kubernetes Service (AKS) for container management
- GitOps Deployment: ArgoCD for declarative continuous delivery
- Auto-scaling: Karpenter for intelligent node provisioning
- Full Observability: Prometheus, Grafana, Loki for metrics and logs
- Automated SSL/TLS: cert-manager with Let's Encrypt
- Infrastructure as Code: Terraform for reproducible infrastructure
- Container Registry: GitHub Container Registry (GHCR) + Docker Hub
- Managed Database: PostgreSQL on Azure with automated backups
- Caching Layer: Redis cluster for session management
- Object Storage: Azure Blob Storage with CloudFront CDN
- Security First: Network policies, RBAC, encrypted storage
┌─────────────────────────────────────────────────────────────────┐
│                        Internet Traffic                         │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
                     ┌───────────────┐
                     │ Load Balancer │
                     └───────┬───────┘
                             │
                             ▼
                 ┌───────────────────────┐
                 │     Nginx Ingress     │
                 │   (SSL Termination)   │
                 └───────────┬───────────┘
                             │
               ┌─────────────┼─────────────┐
               │             │             │
               ▼             ▼             ▼
          ┌─────────┐   ┌─────────┐   ┌─────────┐
          │Frontend │   │ Backend │   │   AI    │
          │  Pods   │   │  Pods   │   │   Bot   │
          └────┬────┘   └────┬────┘   └────┬────┘
               │             │             │
               │             ▼             │
               │     ┌───────────────┐     │
               │     │  PostgreSQL   │     │
               │     │    (Azure)    │     │
               │     └───────────────┘     │
               │                           │
               └─────────────┬─────────────┘
                             │
                             ▼
                      ┌─────────────┐
                      │    Redis    │
                      │   Cluster   │
                      └─────────────┘
Developer Push → GitHub → CI/CD Pipeline → Docker Build → Push to Registry
                                                                  │
                                                                  ▼
                                                         Update Helm Charts
                                                                  │
                                                                  ▼
                                                        ArgoCD Detects Change
                                                                  │
                                                                  ▼
                                                     Deploy to Kubernetes (30s)
                                                                  │
                                                                  ▼
                                                        Discord Notification
- Cloud Provider: Microsoft Azure
- Container Orchestration: Azure Kubernetes Service (AKS)
- Infrastructure as Code: Terraform
- Package Management: Helm
- GitOps: ArgoCD
- CI/CD Platform: GitHub Actions
- Container Registry: GitHub Container Registry (GHCR), Docker Hub
- Configuration Management: Ansible
- Build Tools: Docker, multi-stage builds
- Metrics: Prometheus
- Visualization: Grafana
- Log Aggregation: Loki
- Log Shipping: Promtail
- Alerting: AlertManager
- Ingress Controller: Nginx Ingress
- Certificate Management: cert-manager
- SSL Provider: Let's Encrypt
- DNS Management: Azure DNS / Route53
- Primary Database: PostgreSQL (Azure Database)
- Cache/Sessions: Redis (Kubernetes-managed)
- Object Storage: Azure Blob Storage
- CDN: CloudFront
- Node Auto-scaling: Karpenter
- Pod Auto-scaling: Horizontal Pod Autoscaler (HPA)
- Spot Instances: Enabled for 50%+ cost savings
- Location: CI/CD pipeline (runner/.github/workflows/cicd-pipeline.yml)
- Implementation: Universal pipeline dynamically selects build scripts based on repository
- Benefit: Single workflow handles backend, frontend, and cross-platform projects
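
The script selection described above could look roughly like the following. This is a minimal sketch, not the actual workflow: the function name `select_build_script` is hypothetical, and the repository names and paths are assumed from the directory layout of this repository.

```shell
# Map a repository name to its build script.
# Assumption: repository names match the service directories in this repo.
select_build_script() {
  case "$1" in
    backend)        echo "backend/build.sh" ;;
    frontend)       echo "frontend/build.sh" ;;
    cross-platform) echo "cross-platform/build.sh" ;;
    *)
      echo "unknown repository: $1" >&2
      return 1
      ;;
  esac
}

# Print the command instead of running it, so the sketch has no side effects.
script="$(select_build_script backend)" && echo "would run: bash $script"
```

Keeping the mapping in one function means a new service only needs one extra `case` branch rather than a new workflow.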
- Location: Deployment scripts (devops/*/deploy.sh)
- Implementation: Different Helm values files (dev/prod) selected at runtime
- Benefit: Environment-specific configurations without code duplication
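
As an illustration of selecting a values file at runtime, here is a hedged sketch. The file names `values-dev.yaml` and `values-prod.yaml`, the chart path, and both function names are assumptions made for the example; the real deploy.sh scripts may be organized differently.

```shell
# Pick the Helm values file for an environment.
# Assumption: values files are named values-<env>.yaml inside helm-chart/.
values_file_for() {
  case "$1" in
    dev|prod) echo "helm-chart/values-$1.yaml" ;;
    *)
      echo "unsupported environment: $1" >&2
      return 1
      ;;
  esac
}

# Build the helm command as a string rather than executing it,
# so the sketch can be tried without a cluster.
deploy_command() {
  env_name="$1"
  release="$2"
  values="$(values_file_for "$env_name")" || return 1
  echo "helm upgrade --install $release ./helm-chart -f $values"
}

deploy_command prod backend
```

Rejecting unknown environment names early avoids silently deploying with chart defaults when the argument is mistyped.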
- Location: Helm charts (*/helm-chart/templates/)
- Implementation: Base Kubernetes manifests with variable substitution
- Benefit: DRY principle, reusable templates across services
- Location: Terraform modules (infra/terraform-prod/modules/)
- Implementation: Composable infrastructure blocks (VPC, EKS, RDS)
- Benefit: Testable, reusable infrastructure components
- Location: Docker multi-stage builds
- Implementation: Layered approach (base → build → runtime)
- Benefit: Smaller images, better caching, production security
- Location: Monitoring stack
- Implementation: Prometheus scrapes metrics, AlertManager triggers notifications
- Benefit: Decoupled monitoring, scalable alerting
- Location: Nginx Ingress Controller
- Implementation: Reverse proxy routing external traffic to internal services
- Benefit: Centralized SSL, load balancing, single entry point
- Location: Multi-repo structure
- Implementation: Separate repos for backend, frontend, cross-platform, devops
- Benefit: Independent versioning, team autonomy, clear boundaries
- Azure subscription (or AWS for alternative setup)
- kubectl, helm, terraform installed locally
- GitHub account with Personal Access Token (PAT)
- Docker installed for local testing
# 1. Clone the repository
git clone https://github.com/SWEProject25/devops.git
cd devops
# 2. Configure Terraform variables
cd infra/terraform
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values
# 3. Provision infrastructure
terraform init
terraform plan
terraform apply
# 4. Configure kubectl
aws eks update-kubeconfig --region us-east-1 --name hankers-eks
# 5. Install Kubernetes components using Ansible
cd ../setup/ansible
ansible-playbook playbook.yml
# 6. Verify deployment
kubectl get pods -A

# Deploy a specific service manually
cd backend # or frontend, ml-model-service
bash deploy.sh

devops/
├── backend/                      # Backend service deployment
│   ├── helm-chart/               # Kubernetes manifests
│   ├── build.sh                  # Docker build script
│   ├── deploy.sh                 # Helm deployment script
│   └── sonar-project.properties
├── frontend/                     # Frontend service deployment
│   ├── helm-chart/
│   ├── build.sh
│   ├── deploy.sh
│   └── sonar-project.properties
├── cross-platform/               # Flutter mobile app builds
│   ├── build.sh
│   └── sonar-project.properties
├── ml-model-service/             # AI service deployment
│   ├── build.sh
│   └── deploy.sh
├── ai-bot/                       # AI bot service
│   └── build.sh
├── infra/
│   ├── terraform/                # Infrastructure as Code
│   │   ├── main.tf
│   │   ├── modules/              # Reusable Terraform modules
│   │   │   ├── vpc/
│   │   │   ├── eks/
│   │   │   ├── rds/
│   │   │   ├── s3/
│   │   │   ├── karpenter/
│   │   │   └── cloudfront/
│   │   ├── outputs.tf
│   │   └── variables.tf
│   └── setup/
│       ├── ansible/              # Kubernetes setup automation
│       │   ├── playbook.yml
│       │   ├── roles/            # Ansible roles for each component
│       │   │   ├── helm_repos/
│       │   │   ├── metrics-server/
│       │   │   ├── ingress/
│       │   │   ├── cert_manager/
│       │   │   ├── redis/
│       │   │   ├── monitoring/
│       │   │   ├── logging/
│       │   │   ├── karpenter/
│       │   │   └── argocd/
│       │   └── inventory/
│       └── config/               # Configuration files
│           ├── argocd/
│           ├── ingress/
│           ├── karpenter/
│           ├── monitoring/
│           └── logging/
└── README.md
Each service requires specific environment variables. These are managed through:
- GitHub Secrets: Sensitive credentials (API keys, passwords)
- Helm Values: Service configurations (values.yaml)
- ConfigMaps: Non-sensitive application configs
- Secrets: Encrypted sensitive data in Kubernetes

- terraform.tfvars: Infrastructure parameters
- values.yaml: Helm chart configurations
- sonar-project.properties: Code quality settings
- ansible.cfg: Automation configuration
- Grafana: https://grafana.hankers.tech (admin/admin)
- Prometheus: https://prometheus.hankers.tech
- ArgoCD: https://argocd.hankers.tech
- Kubernetes Cluster Overview: CPU, memory, pods, nodes
- Node Exporter: Detailed node metrics
- Application Metrics: Custom service metrics
- Log Exploration: Loki integration for log analysis
AlertManager configured for:
- High CPU/memory usage
- Pod crash loops
- Certificate expiration warnings
- Database connection issues
- Network Policies: Pod-to-pod traffic control
- RBAC: Role-based access control for Kubernetes
- Encrypted Storage: All persistent volumes encrypted
- SSL/TLS: Automatic certificate management
- Secret Management: Kubernetes secrets for credentials
- Security Scanning: Container image vulnerability scanning
- Pod Security Standards: Enforced security policies
- Karpenter Auto-scaling: Intelligent node provisioning
- Spot Instances: 50-70% cost reduction for non-critical workloads
- Right-sizing: Automated resource optimization
- No NAT Gateway: Saved $45/month by using public subnets
- ARM Instances: 20% cheaper than x86 (t4g vs t3)
- Storage Optimization: gp3 volumes instead of gp2
- Reduced Backup Retention: 1 day instead of 7 for non-prod
- Kubernetes Cluster: ~$70-100
- Database (RDS): ~$15-30
- Load Balancer: ~$20
- Storage: ~$10
- Monitoring: ~$5
- Total: ~$120-165/month
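
The total above can be checked by summing the low and high ends of each line item. The sketch below uses only the figures listed; the function name `sum_costs` and the item labels are made up for the example.

```shell
# Sum the low and high monthly estimates (USD) from the list above.
# Each input row is: <item> <low> <high>; single-value items repeat the value.
sum_costs() {
  low=0
  high=0
  while read -r name lo hi; do
    [ -n "$name" ] || continue
    low=$((low + lo))
    high=$((high + hi))
  done <<EOF
cluster 70 100
database 15 30
load_balancer 20 20
storage 10 10
monitoring 5 5
EOF
  echo "$low-$high"
}

echo "Estimated total: \$$(sum_costs)/month"
```

The result is a range of 120 to 165 USD per month, which matches the stated total.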
- Code Checkout: Pull latest code from repository
- Run Tests: Unit tests, integration tests, coverage
- Build Docker Image: Multi-stage builds for optimization
- Push to Registry: GHCR and Docker Hub
- Update Helm Charts: Automated version bump
- ArgoCD Sync: Automatic deployment (30 seconds)
- Notify Team: Discord webhook notification
- Build Time: ~3-5 minutes
- Deployment Time: ~30 seconds (ArgoCD)
- Total Time to Production: ~5 minutes
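
The Discord notification at the end of the pipeline could be sent with a plain webhook call. The sketch below is illustrative only: the function names, the message wording, and the `DISCORD_WEBHOOK_URL` variable are assumptions, and the real workflow may use a dedicated GitHub Action instead. Discord webhooks accept a JSON body with a `content` field.

```shell
# Build the JSON body for a Discord webhook message.
# Assumption: the arguments contain no quotes or backslashes,
# so no JSON escaping is performed here.
build_payload() {
  service="$1"
  version="$2"
  status="$3"
  printf '{"content": "%s %s deployed: %s"}' "$service" "$version" "$status"
}

# Send the payload. DISCORD_WEBHOOK_URL is a hypothetical secret name
# that would be injected by the CI environment.
notify_discord() {
  curl --fail --silent --show-error \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$@")" \
    "$DISCORD_WEBHOOK_URL"
}

# Show the payload only; notify_discord is defined but not called here.
build_payload backend v1.2.3 success
```

Separating payload construction from the network call keeps the message format easy to check without a webhook.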
- Terraform Modules Documentation
- Ansible Roles Documentation
- Helm Chart Templates
- Monitoring Configuration
Karim Farid
This project is for educational and portfolio purposes. The tools used and their licenses:
- Kubernetes, Docker, Helm: Apache 2.0 License
- Terraform: Business Source License 1.1 since version 1.6 (earlier releases were MPL 2.0)
- Prometheus: Apache 2.0 License
- Grafana, Loki: AGPL v3 (acceptable for internal use)
- GitHub Actions: Free tier for private repos
This is a production infrastructure repository. For collaboration or questions:
- Open an issue for discussion
- Submit pull requests with detailed descriptions
- Follow infrastructure best practices
- Test changes in a separate environment first
For infrastructure issues or questions:
- Email: devops@hankers.tech
- Discord: Join our server
Special thanks to the open-source community for the amazing tools that make this infrastructure possible:
- Kubernetes community
- HashiCorp (Terraform)
- Helm maintainers
- Prometheus & Grafana teams
- cert-manager contributors
- Karpenter developers
If you find this DevOps setup useful, consider starring the repository!