diff --git a/README.md b/README.md
index 5f929a6..c8800b9 100644
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ Explore the documentation to understand how **BrainBytes** is architected, deplo
 
 ## Important Files
 
-These files define and support the core automation and infrastructure setup of the project.
+These files support the core automation, infrastructure setup, and presentation delivery of the BrainBytes platform.
 
 - **[GitHub Actions Workflow file (`automation.yml`)](.github/workflows/automation.yml)**
   Main GitHub Actions workflow responsible for CI/CD, including linting, testing, Docker builds, and remote deployment.
@@ -73,6 +73,12 @@ These files define and support the core automation and infrastructure setup of t
 - **[Monitoring System Demonstration Script](./docs/monitoring-demo-script.md)**
   A step-by-step guide for delivering a 10–15 minute live demo of BrainBytes’ monitoring capabilities.
 
+- **[Presentation Outline](docs/presentation-outline.md)**
+  An outline of the presentation covering BrainBytes’ architecture, DevOps implementation, and monitoring strategy.
+
+- **[Live Demonstration Plan](docs/demo-plan.md)**
+  A minute-by-minute walkthrough for presenting the platform’s deployment, CI/CD, and monitoring in a live setting.
+
 - **Dashboard JSON Exports**
   - [main-dashboard.json](./docker/dashboards/main-dashboard.json)
   - [resource-dashboard.json](./docker/dashboards/resource-optimization.json)
diff --git a/docs/demo-plan.md b/docs/demo-plan.md
new file mode 100644
index 0000000..bf6bdd7
--- /dev/null
+++ b/docs/demo-plan.md
@@ -0,0 +1,194 @@
+# BrainBytes AI Tutoring Platform: Live Demonstration Plan
+
+---
+
+## 1. Objective
+
+Showcase the BrainBytes AI Tutoring Platform's robust deployment, automated CI/CD, and comprehensive monitoring capabilities, demonstrating a production-ready system.
+
+---
+
+## 2. Target Audience
+
+- Stakeholders
+- Developers
+- Operations Team
+
+---
+
+## 3. Time Allotment
+
+**16–21 minutes** in total, matching the Presentation Outline structure:
+
+| Section | Time Allotted |
+|----------------------------------------|-------------------|
+| Introduction & Project Overview | 2–3 minutes |
+| System Architecture Explanation | 3–4 minutes |
+| DevOps Implementation Demonstration | 7–8 minutes |
+| Operations Capabilities Showcase | 3–4 minutes |
+| Conclusion & Q&A | 1–2 minutes |
+
+---
+
+## 4. Technical Requirements for Demonstration
+
+### Internet Connection
+- Stable, high-speed internet access
+
+### Projector or Screen Sharing
+- For displaying the presentation and live demos
+
+### Web Browser
+- Chrome or Firefox recommended
+- URLs to access:
+  - **BrainBytes Frontend**: `https://brainbytes.mcoube.uk`
+  - **Grafana Dashboard**: `GRAFANA_URL_FQDN`
+  - **GitHub Actions Workflows** (optional)
+
+### Terminal Access (Optional but Recommended)
+- Local development environment with Git and Docker Compose installed
+- SSH access to OVHCloud VPS to run:
+  - `docker ps`
+  - `docker logs`
+
+### Pre-configured Environment
+- BrainBytes app (frontend, backend, database) running on OVHCloud VPS
+- Prometheus + Grafana stack active and collecting metrics
+- Traffic simulation script (`simulation/index.js`) ready for use
+
+---
+
+## 5. Step-by-Step Live Demonstration Plan
+
+### Introduction & Project Overview (2–3 minutes)
+
+- **Presenter**: Introduce the project
+- Reference the “Introduction and Project Overview” section of the Presentation Outline
+- Emphasize:
+  - Project vision
+  - Key characteristics
+  - Milestone objectives achieved
+  - Team roles
+
+---
+
+### System Architecture Explanation (3–4 minutes)
+
+- Walk through the “System Architecture Explanation” section
+- **Visual Aids**:
+  - `images/cloud-platform-architecture.png`
+  - `images/cloud-platform-w-monitoring-architecture.png`
+- Discuss:
+  - Containerized services
+  - VPS specs
+  - Security layers:
+    - UFW
+    - Fail2ban
+    - SSH hardening
+    - Traefik config
+    - Secrets management
+
+---
+
+### DevOps Implementation Demonstration (7–8 minutes)
+
+- Cover the “DevOps Implementation Demonstration” section
+- **Visual Aids**:
+  - `images/pipeline-diagram.png`
+  - `images/gha-to-vps.png`
+
+#### Option 1: Simulate a Push
+1. Show a minor change in the frontend or backend (e.g., a comment)
+2. Commit and push to `main`
+3. Open GitHub Actions → show the triggered workflow (`automation.yml`)
+4. Explain:
+   - Linting
+   - Testing
+   - Docker build & push (backend & frontend)
+5. While the workflow runs:
+   - Explain the Watchtower deployment process
+6. SSH into the VPS:
+   - Show `docker ps` output
+
+#### Option 2: Show Recent Workflow
+- Open a recent successful run of `automation.yml`
+- Explain:
+  - GHCR image push
+  - Watchtower update process
+- Confirm via `docker ps`
+
+---
+
+### Operations Capabilities Showcase (3–4 minutes)
+
+- Cover the “Operations Capabilities Showcase” section
+- Live Demo of Grafana Dashboards:
+  - Open **Grafana URL**
+
+#### System Stats Dashboard (`resource-dashboard.png`)
+- CPU, memory, disk, and network metrics
+- Show healthy system state
+
+#### DevOps Dashboard (`main-dashboard-2.png`)
+- Most requested services
+- Entry point connections
+- HTTP status codes (2xx vs others)
+
+#### Traefik Error Dashboard (`error-dashboard-2.png`)
+- HTTP request/response breakdown
+- An absence of 5xx errors indicates healthy services
+- Request and response sizes
+
+#### Alerting System
+- Show Grafana Alerts or Alerting Dashboard
+- Visual: `alert-rules-page.png`
+- Explain alert routing (e.g., Discord notifications via Alertmanager)
+
+#### Live Traffic Simulation (Optional)
+- Run `simulation/index.js`
+- Watch dashboard metrics spike
+- Optionally tail logs:
+  - `docker
+
+---
+
+### Conclusion & Q&A (1–2 minutes)
+
+- Recap achievements:
+  - Scalable, secure platform
+  - Effective DevOps processes
+  - Real-time monitoring + alerting
+- Review lessons learned:
+  - Importance of CI/CD
+  - Monitoring & observability
+  - Security best practices
+- Open the floor for Q&A
+
+---
+
+## 6. Backup Plans in Case of Technical Issues
+
+### Internet Connectivity Loss
+- Use static slides/screenshots of:
+  - Grafana dashboards
+  - GitHub Actions logs and runs
+
+### Live Demo Failure (CI/CD or Monitoring)
+- Fall back to pre-recorded video walkthroughs or high-res images
+- Narrate each phase of the CI/CD and monitoring process
+
+### VPS Unresponsive / App Down
+- Explain documented recovery procedures and system resilience
+- Emphasize monitoring’s ability to detect and report downtime
+
+### Simulation Script Fails
+- Explain its purpose
+- Point to existing dashboard metrics to illustrate traffic data
+
+### General Troubleshooting
+- Keep on hand for reference:
+  - `system-design-documentation.md`
+  - `monitoring-documentation.md`
+- Be ready to explain:
+  - `.env` file handling
+  - Secrets management strategy
\ No newline at end of file
diff --git a/docs/presentation-outline.md b/docs/presentation-outline.md
new file mode 100644
index 0000000..fecf798
--- /dev/null
+++ b/docs/presentation-outline.md
@@ -0,0 +1,198 @@
+# BrainBytes AI Tutoring Platform: Presentation Outline
+
+---
+
+## 1. Introduction and Project Overview (2–3 minutes)
+
+### Project Vision
+- BrainBytes is an AI-powered tutoring platform for Filipino students.
+- Goal: Deliver accessible academic assistance via a robust, scalable solution.
+
+### Key Characteristics
+- Containerized Architecture (Docker microservices)
+- Automated CI/CD (GitHub Actions)
+- Cloud Deployment (OVHCloud VPS)
+- Security-First Approach (Firewall, intrusion prevention, TLS)
+
+### Milestone 2 Objectives Achieved
+- Containerized application with Docker networking
+- Automated CI/CD pipeline
+- Deployment to OVHCloud VPS
+- Security hardening for production
+- Monitoring and observability foundations established
+
+### Team and Responsibilities
+- Brief introductions to team roles:
+  - Team Lead
+  - Backend Developer
+  - Frontend Developer
+  - DevOps Engineer
+
+---
+
+## 2. System Architecture Explanation (3–4 minutes)
+
+### Cloud Platform Architecture
+- Overview of OVHCloud VPS hosting all services
+- Key components:
+  - Docker Runtime
+  - socket-proxy
+  - mongo
+  - reverse-proxy (Traefik)
+  - adonisjs (backend)
+  - nextjs (frontend)
+  - watchtower
+
+- Diagrams:
+  - `images/cloud-platform-architecture.png`
+  - `images/cloud-platform-w-monitoring-architecture.png`
+
+### Containerized Application Services
+- Explanation of each Docker Compose service
+- Benefits of containerization: isolation, portability, scalability
+
+### Resource Configuration
+- VPS Specs: 2 vCPU, 2GB RAM, 40GB SSD, Ubuntu LTS
+
+### Networking and Security Setup
+- Multi-layered security approach via Ansible playbook
+
+**Firewall (UFW):**
+- Default-deny policy
+- Allowed ports: SSH, HTTP/S
+
+**Intrusion Detection/Prevention (Fail2ban):**
+- SSH brute-force protection
+
+**SSH Security Hardening:**
+- Key-based authentication
+- Root login prohibited
+
+**Application Security (Traefik):**
+- TLS enforcement
+- Strong cipher suites
+- Security headers:
+  - HSTS
+  - X-XSS-Protection
+  - X-Content-Type-Options
+  - X-Frame-Options
+
+**Secrets Management:**
+- GitHub Actions Secrets (encrypted)
+- `.env` file hardening on VPS
+- No secrets in version control
+
+---
+
+## 3. DevOps Implementation Demonstration (7–8 minutes)
+
+### CI/CD Pipeline Architecture
+- GitHub Actions workflow: `automation.yml`
+- Diagram: `images/pipeline-diagram.png`
+- Triggers:
+  - `push` (main/develop)
+  - `pull_request`
+  - `workflow_dispatch`
+
+### Key Stages
+- **Linting**: Code quality checks using ESLint, Prettier
+- **Testing**:
+  - Backend: Japa (unit, integration)
+  - Frontend: Cypress (component/UI)
+- **Build/Push**:
+  - Multi-stage Docker builds
+  - Push to GitHub Container Registry (GHCR)
+
+### Integration with Containerized Application
+- GitHub Actions pushes images to GHCR
+- Watchtower automatically pulls and updates containers
+- Diagram: `images/gha-to-vps.png`
+
+### Environment Variable Management & Secrets Handling
+- Development: `.env`
+- CI/CD: GitHub Actions Secrets
+- Production: Manual transfer to VPS with `chmod 600`
+- Critical secrets:
+  - `APP_KEY`
+  - `GEMINI_KEY`
+  - `MONGO_ATLAS_URI`
+
+### Artifact Management
+- Docker images as versioned artifacts in GHCR
+
+### Deployment Process Flow
+1. VPS provisioning (Ansible)
+2. Image build and push (GitHub Actions)
+3. Application update (Watchtower)
+4. Traffic management (Traefik)
+- Diagram: `images/deployment-process-flow.png`
+
+---
+
+## 4. Operations Capabilities Showcase (3–4 minutes)
+
+### Monitoring and Observability
+
+**Architecture Components:**
+- Prometheus: metrics scraper and time-series database
+- Grafana: dashboard and visualization
+- Node Exporter: system metrics
+- AdonisJS App: custom metrics
+- Traefik: HTTP request metrics
+- Alertmanager: alert routing
+- alertmanager-discord-relay: Discord integration
+
+**Diagrams:**
+- `images/monitoring-architecture.png`
+- `images/cloud-platform-w-monitoring-architecture.png`
+
+### Dashboard Walkthrough (Live Demo Recommended)
+- **System Stats Dashboard** (`resource-dashboard.png`): CPU, Memory, Disk, Network usage
+- **DevOps Dashboard** (`main-dashboard-2.png`): Traefik services, request stats, status codes
+- **Traefik Error Dashboard** (`error-dashboard-2.png`): HTTP error tracking, request/response size insights
+
+### Alerting System
+- Defined alerts:
+  - NodeDown
+  - HighMemoryUsage
+  - HighCPUUsage
+  - LowDiskSpace
+  - HighTraefikErrorRate
+  - AppEndpointDown
+- Show rules in: `alert-rules-page.png`
+- Alert routing explained (e.g., grouped and sent to Discord)
+- Outline response procedures for critical alerts
+
+### Troubleshooting Procedures
+- Reference troubleshooting matrix:
+  - Pipeline failures
+  - Container boot issues
+  - API errors
+  - Unresponsive services (Traefik, app)
+
+### Maintenance Tasks
+- Routine updates:
+  - OS updates (via package manager)
+  - Docker engine
+  - Containers (Watchtower handles automation)
+
+---
+
+## 5. Conclusion and Lessons Learned (1–2 minutes)
+
+### Summary of Achievements
+- Deployed a secure, scalable AI tutoring platform
+- Integrated DevOps best practices
+- Ensured observability, security, and maintainability
+
+### Key Takeaways
+- Containerization ensures consistency across environments
+- CI/CD improves development speed and reliability
+- Monitoring helps catch issues early
+- Security must be layered and proactive
+
+### Future Enhancements (Optional)
+- Mention planned features, scalability options, or AI-related upgrades
+
+### Q&A
+- Open floor for discussion and questions