A comprehensive demonstration platform showcasing Azure SRE Agent capabilities with intentional chaos scenarios, complete monitoring, and automated incident response.
- Overview
- Architecture
- Tech Stack
- Prerequisites
- Quick Start
- E2E Testing
- Copilot Agents
- Chaos Scenarios
- Monitoring & Alerts
- CI/CD Pipeline
- Contributing
This repository demonstrates:
- Full-stack Todo application (React + Node.js + TypeScript)
- Infrastructure as Code with Terraform
- Comprehensive Azure monitoring and alerting
- GitHub Actions CI/CD pipelines
- Intentional bugs and chaos scenarios for SRE Agent demonstration
- Automated issue creation and resolution workflows
- End-to-end testing with Playwright and AI-powered test agents
- Copilot custom agents for automated test generation and execution
┌─────────────────┐
│ React Frontend  │
│  (Static Web)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐      ┌──────────────┐
│   Express API   │◄────►│  PostgreSQL  │
│  (App Service)  │      │  (Flexible)  │
└────────┬────────┘      └──────────────┘
         │
         ▼
┌─────────────────┐      ┌──────────────┐
│   Redis Cache   │      │ App Insights │
│                 │      │ + Monitoring │
└─────────────────┘      └──────────────┘
- React 18 with TypeScript
- Vite for build tooling
- Tailwind CSS for styling
- React Query (TanStack Query) for data fetching
- Axios for HTTP client
- Azure Static Web Apps for hosting
- Node.js 20 with Express
- TypeScript for type safety
- Prisma ORM for database access
- Redis for caching
- Winston for logging
- Azure App Service / Container Apps
- Azure Database for PostgreSQL Flexible Server
- Azure Cache for Redis
- Azure Application Insights
- Azure Key Vault
- Azure Monitor & Log Analytics
- Terraform for IaC
- Playwright for end-to-end browser testing
- Playwright MCP for visual exploration and test generation
- Page Object Model pattern for maintainable tests
- API mocking with page.route() for deterministic tests
- GitHub Actions for CI/CD
- Docker for containerization
- Azure CLI for deployment
- Terraform Cloud (optional)
- Node.js 20+ and npm/yarn
- Docker and Docker Compose
- Terraform 1.5+
- Azure CLI 2.50+
- Azure Subscription with contributor access
- GitHub Account with Actions enabled
git clone https://github.com/your-org/sre-demo.git
cd sre-demo
# Install dependencies for both frontend and backend
cd frontend && npm install
cd ../backend && npm install
# Start local infrastructure (PostgreSQL + Redis)
docker-compose up -d
# Run database migrations
cd backend
npm run prisma:migrate
# Start backend (terminal 1)
npm run dev
# Start frontend (terminal 2)
cd ../frontend
npm run dev
Frontend: http://localhost:5173
Backend: http://localhost:3000
# Login to Azure
az login
# Initialize Terraform
cd terraform/environments/dev
terraform init
# Plan infrastructure
terraform plan -out=tfplan
# Apply infrastructure
terraform apply tfplan
Required secrets for GitHub Actions:
- AZURE_CREDENTIALS
- AZURE_SUBSCRIPTION_ID
- AZURE_RESOURCE_GROUP
- DATABASE_URL
- REDIS_CONNECTION_STRING
This project includes a comprehensive end-to-end testing infrastructure powered by Playwright and Copilot custom agents.
# Install Playwright (one-time setup)
cd e2e && npm install && npx playwright install chromium
# Run all E2E tests
npm run test:e2e
# Run tests with UI mode (interactive)
npm run test:e2e:ui
# Run tests in headed browser
npm run test:e2e:headed
e2e/
  playwright.config.ts   # Playwright configuration (chromium + mobile)
  pages/                 # Page Object Models
  tests/                 # Test spec files
  fixtures/              # Shared mock data and setup
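For orientation, a configuration matching the "chromium + mobile" description might look roughly like the sketch below. This is not the repository's actual playwright.config.ts. The project names, the baseURL, and the browserName override are assumptions made for illustration.

```typescript
// Illustrative sketch only; the real e2e/playwright.config.ts may differ.
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Spec files live under e2e/tests/ according to the layout above.
  testDir: './tests',
  use: {
    // Assumption: tests target the local Vite dev server from the Quick Start.
    baseURL: 'http://localhost:5173',
  },
  projects: [
    // Desktop viewport on Chromium.
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    // Mobile viewport using the iPhone 13 device descriptor. That descriptor
    // defaults to WebKit, so this sketch overrides the browser to Chromium
    // because the setup command above installs only Chromium (an assumption
    // about how the project handles this).
    { name: 'mobile', use: { ...devices['iPhone 13'], browserName: 'chromium' } },
  ],
});
```

With a setup along these lines, `npx playwright test --project=mobile` would run only the mobile viewport.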
- API Mocking: All tests use page.route() to mock API calls, so no backend is required
- Page Object Model: Each page has a dedicated POM class for maintainability
- Accessible Selectors: Tests use getByRole(), getByText(), and getByPlaceholder() over CSS selectors
- Multi-viewport: Tests run on both desktop Chrome and mobile (iPhone 13) viewports
- Deterministic: Mocked API responses ensure consistent, reliable test results
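The mocking and Page Object patterns listed above can be sketched as follows. To keep the example runnable without Playwright installed, it declares small local interfaces (PageLike, RouteLike, LocatorLike) that mirror the parts of Playwright's Page, Route, and Locator it uses. In a real spec you would import those types from @playwright/test instead. The Todo shape, the fixture data, the placeholder text, and the button name are all hypothetical.

```typescript
// Local structural subsets of Playwright's types (assumption: a real Page
// offers these methods with compatible signatures).
interface RouteLike {
  fulfill(response: { status: number; contentType: string; body: string }): Promise<void>;
}
interface LocatorLike {
  click(): Promise<void>;
  fill(value: string): Promise<void>;
}
interface PageLike {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
  getByPlaceholder(text: string): LocatorLike;
}

// Hypothetical Todo shape and fixture data for illustration.
interface Todo {
  id: number;
  title: string;
  completed: boolean;
}
const mockTodos: Todo[] = [
  { id: 1, title: 'Write runbook', completed: false },
  { id: 2, title: 'Review alerts', completed: true },
];

// Intercepts the todos endpoint and answers with fixed JSON, so the test
// never reaches a real backend.
async function mockTodosApi(page: PageLike, todos: Todo[]): Promise<void> {
  await page.route('**/api/todos', async (route) => {
    await route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(todos),
    });
  });
}

// Page Object: wraps selectors behind intention-revealing methods.
class TodosPage {
  private readonly page: PageLike;

  constructor(page: PageLike) {
    this.page = page;
  }

  // Types into the search box, located by its (hypothetical) placeholder.
  async search(term: string): Promise<void> {
    await this.page.getByPlaceholder('Search todos').fill(term);
  }

  // Opens the create modal via an accessible role selector.
  async openCreateModal(): Promise<void> {
    await this.page.getByRole('button', { name: 'Add Todo' }).click();
  }
}
```

A spec would typically call mockTodosApi(page, mockTodos) before navigating, then drive the UI only through TodosPage, which keeps selector changes confined to one class.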
| Page | Tests | Coverage |
|---|---|---|
| Navigation & Layout | Menu, routing, mobile menu, redirects | Core |
| Dashboard | Stats cards, links, priority breakdown | Core |
| Todos | CRUD via modal, search, filters, toggle | Full |
| Projects | List, create, detail view, members | Full |
| Users | List, search, filter, detail view | Full |
See E2E Testing Instructions for conventions and best practices.
This project includes specialized GitHub Copilot custom agents for automated test generation and infrastructure management.
| Agent | Description | Usage |
|---|---|---|
| Playwright E2E Tester | Orchestrates the full test pipeline: explore → plan → implement → run | @playwright-tester "Generate tests for Todos page" |
| Playwright Explorer | Navigates the app via Playwright MCP, documents elements and flows | @playwright-explorer "Explore the Dashboard" |
| Playwright Test Planner | Creates phased test plans from exploration findings | @playwright-planner "Create test plan" |
| Playwright Test Implementer | Writes Page Objects and test specs, runs and fixes until passing | @playwright-implementer "Implement Phase 3" |
| Agent | Description |
|---|---|
| Azure Infrastructure Expert | Azure IaC, monitoring, troubleshooting, and optimization guidance |
User invokes @playwright-tester
│
├─► @playwright-explorer → Navigates site, documents in .testagent/exploration.md
│
├─► @playwright-planner → Reads exploration + source, generates .testagent/e2e-plan.md
│
└─► @playwright-implementer (per phase)
    ├─ Creates e2e/pages/*.page.ts (Page Objects)
    ├─ Creates e2e/tests/*.spec.ts (with API mocks)
    ├─ Runs npx playwright test
    └─ Fixes failures (up to 3 retries)
Agent files are located in .github/agents/.
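The implementer's "run, then fix failures (up to 3 retries)" step can be read as a bounded retry loop. The sketch below is one plausible interpretation, not the agent's actual logic. The RunResult shape and the runTests and fixFailures callbacks are hypothetical stand-ins for "run npx playwright test" and "edit the failing specs".

```typescript
// Hypothetical result of one test run.
interface RunResult {
  passed: boolean;
  failures: string[];
}

// Runs the suite once, then alternates fix and re-run until the suite passes
// or the retry budget is spent. Returns the final status and retries used.
async function runUntilPassing(
  runTests: () => Promise<RunResult>,
  fixFailures: (failures: string[]) => Promise<void>,
  maxRetries: number = 3,
): Promise<{ passed: boolean; retries: number }> {
  let result = await runTests();
  let retries = 0;
  while (!result.passed && retries < maxRetries) {
    await fixFailures(result.failures); // attempt a repair based on the failures
    result = await runTests();          // re-run to check whether the fix worked
    retries += 1;
  }
  return { passed: result.passed, retries };
}
```

Under this reading the suite runs at most four times in total: the initial run plus three retries. A phase that still fails after that is reported rather than retried indefinitely.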
This repository includes intentional bugs and performance issues for demonstration purposes. See docs/CHAOS_SCENARIOS.md for detailed trigger instructions.
| Scenario | Type | Severity | Trigger |
|---|---|---|---|
| Memory Leak | Performance | High | POST /api/chaos/memory-leak |
| N+1 Query Problem | Performance | Medium | GET /api/todos?inefficient=true |
| Missing Index | Performance | Medium | GET /api/todos/search?q=term |
| Connection Pool Exhaustion | Availability | Critical | POST /api/chaos/exhaust-pool |
| Unhandled Promise Rejection | Reliability | High | POST /api/chaos/unhandled-promise |
| CPU Intensive Loop | Performance | Critical | POST /api/chaos/cpu-spike |
| Database Timeout | Availability | High | POST /api/chaos/db-timeout |
| Cache Invalidation Bug | Data Integrity | Medium | PUT /api/todos/:id?skipCache=true |
| Missing Error Handling | Reliability | Medium | POST /api/todos (malformed data) |
| Infrastructure Drift | Configuration | Low | Manual Terraform changes |
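For scripted demos, the parameterless triggers in the table can be collected in a small typed map. The sketch below is a hypothetical helper, not part of the repository. The scenario key names are invented, and because the table does not say whether the endpoints expect a request body, the sketch sends none. It relies on the global fetch available in Node.js 20.

```typescript
type HttpMethod = 'GET' | 'POST';

interface ChaosTrigger {
  method: HttpMethod;
  path: string;
}

// Hypothetical keys; methods and paths are copied from the table above.
type ScenarioName =
  | 'memoryLeak'
  | 'nPlusOneQuery'
  | 'connectionPoolExhaustion'
  | 'unhandledPromiseRejection'
  | 'cpuSpike'
  | 'dbTimeout';

const chaosTriggers: Record<ScenarioName, ChaosTrigger> = {
  memoryLeak: { method: 'POST', path: '/api/chaos/memory-leak' },
  nPlusOneQuery: { method: 'GET', path: '/api/todos?inefficient=true' },
  connectionPoolExhaustion: { method: 'POST', path: '/api/chaos/exhaust-pool' },
  unhandledPromiseRejection: { method: 'POST', path: '/api/chaos/unhandled-promise' },
  cpuSpike: { method: 'POST', path: '/api/chaos/cpu-spike' },
  dbTimeout: { method: 'POST', path: '/api/chaos/db-timeout' },
};

// Resolves a scenario to the full URL and method to call.
function buildTriggerRequest(
  baseUrl: string,
  name: ScenarioName,
): { method: HttpMethod; url: string } {
  const trigger = chaosTriggers[name];
  return { method: trigger.method, url: new URL(trigger.path, baseUrl).toString() };
}

// Fires the trigger and returns the HTTP status code.
async function triggerScenario(baseUrl: string, name: ScenarioName): Promise<number> {
  const request = buildTriggerRequest(baseUrl, name);
  const response = await fetch(request.url, { method: request.method });
  return response.status;
}
```

For example, triggerScenario('http://localhost:3000', 'cpuSpike') would target the local backend from the Quick Start. Only run this against environments set aside for the demo.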
- ⚠️ High CPU Usage (>80% for 5 minutes)
- ⚠️ High Memory Usage (>85% for 5 minutes)
- ⚠️ Error Rate Spike (>5% of requests)
- ⚠️ Response Time Degradation (>2s average)
- ⚠️ Database Connection Issues
- ⚠️ Failed Deployments
- ⚠️ Infrastructure Drift Detected
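To make the numeric thresholds concrete, the sketch below evaluates the four metric-based rules against a window of samples. It only illustrates the rule semantics. The actual alerts are presumably defined in Azure Monitor, which may aggregate differently. The MetricSample shape and the reading of "for 5 minutes" as "every sample in the window exceeds the threshold" are assumptions.

```typescript
// Hypothetical per-minute metric sample.
interface MetricSample {
  cpuPercent: number;
  memoryPercent: number;
  requests: number;
  failedRequests: number;
  avgResponseMs: number;
}

// Returns the names of the alerts that would fire for the given window
// (assumed to hold the samples from the last 5 minutes).
function evaluateAlerts(window: MetricSample[]): string[] {
  if (window.length === 0) return [];
  const alerts: string[] = [];

  // Sustained thresholds: every sample in the window must exceed the limit.
  if (window.every((s) => s.cpuPercent > 80)) alerts.push('High CPU Usage');
  if (window.every((s) => s.memoryPercent > 85)) alerts.push('High Memory Usage');

  const totalRequests = window.reduce((sum, s) => sum + s.requests, 0);
  const totalFailed = window.reduce((sum, s) => sum + s.failedRequests, 0);
  // Request-weighted sum so busy minutes count more toward the average.
  const weightedMs = window.reduce((sum, s) => sum + s.avgResponseMs * s.requests, 0);

  if (totalRequests > 0) {
    if (totalFailed / totalRequests > 0.05) alerts.push('Error Rate Spike');
    if (weightedMs / totalRequests > 2000) alerts.push('Response Time Degradation');
  }
  return alerts;
}
```

A helper like this can be handy for checking that a chaos scenario pushes metrics far enough to cross a threshold before relying on the real alert to fire.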
- Application Performance (Application Insights)
- Infrastructure Health (Azure Monitor)
- Database Performance (PostgreSQL Insights)
- Cache Metrics (Redis Insights)
- Frontend Deployment (.github/workflows/frontend-deploy.yml)
  - Build React application
  - Run tests and linting
  - Deploy to Azure Static Web Apps
- Backend Deployment (.github/workflows/backend-deploy.yml)
  - Build Docker image
  - Run tests and linting
  - Push to Azure Container Registry
  - Deploy to Azure App Service
- Infrastructure Deployment (.github/workflows/infrastructure-deploy.yml)
  - Terraform plan
  - Manual approval for production
  - Terraform apply
  - Drift detection
- GitHub Setup Guide - START HERE - Configure secrets and CI/CD
- Architecture Details
- Chaos Scenarios Guide
- Deployment Guide
- E2E Testing Instructions - Testing conventions and patterns
- Copilot Agents - Custom agents for test generation and infrastructure
This is a demo repository. For suggestions or improvements:
- Fork the repository
- Create a feature branch
- Submit a pull request
MIT License - see LICENSE file for details
Built with ❤️ for Azure SRE Agent demonstrations