System Architecture

Overview

This document describes the architecture of the Todo App, a full-stack application that runs on Azure infrastructure with built-in monitoring and uses GitHub Copilot custom agents for AI-assisted end-to-end testing.

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                        GitHub Actions                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │
│  │   Frontend   │  │   Backend    │  │   Infrastructure     │  │
│  │    Deploy    │  │    Deploy    │  │   Deploy + Drift     │  │
│  └──────────────┘  └──────────────┘  └──────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Azure Cloud Platform                       │
│                                                                   │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              Azure Static Web Apps (Frontend)            │  │
│  │                React + Vite + TypeScript                  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                              │                                   │
│                              ▼ HTTPS                             │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │           Azure App Service (Backend API)                │  │
│  │      Node.js + Express + TypeScript + Prisma             │  │
│  │                                                            │  │
│  └──────────────────────────────────────────────────────────┘  │
│           │                    │                    │           │
│           ▼                    ▼                    ▼           │
│  ┌────────────────┐  ┌─────────────────┐  ┌─────────────────┐ │
│  │   PostgreSQL   │  │  Azure Redis    │  │  Application    │ │
│  │ Flexible Server│  │     Cache       │  │    Insights     │ │
│  │                │  │                 │  │                 │ │
│  │  - Todo Data   │  │  - Session Cache│  │  - Telemetry    │ │
│  │  - Metadata    │  │  - Todo Cache   │  │  - Logs         │ │
│  │  - Tags        │  │  - Rate Limit   │  │  - Metrics      │ │
│  └────────────────┘  └─────────────────┘  │  - Traces       │ │
│                                            └─────────────────┘ │
│                                                     │           │
│                                                     ▼           │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              Log Analytics Workspace                     │  │
│  │                                                            │  │
│  │  - Kusto Query Language (KQL) queries                    │  │
│  │  - Custom dashboards                                      │  │
│  │  - Alert rule evaluation                                  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                              │                                   │
│                              ▼                                   │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                Azure Monitor Alerts                      │  │
│  │                                                            │  │
│  │  - CPU > 80%                                              │  │
│  │  - Memory > 85%                                           │  │
│  │  - HTTP 5xx > 10/min                                      │  │
│  │  - Response Time > 2s                                     │  │
│  │  - Database Connection Failures                           │  │
│  └──────────────────────────────────────────────────────────┘  │
│                              │                                   │
│                              ▼                                   │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              Monitor Action Group                        │  │
│  │                                                            │  │
│  │  - Email notifications                                    │  │
│  │  - Creates GitHub Issues (via Azure Monitor)            │  │
│  │  - Triggers automated remediation                         │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                   │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   Azure Key Vault                        │  │
│  │                                                            │  │
│  │  - Database connection strings                            │  │
│  │  - Redis connection strings                               │  │
│  │  - Application Insights keys                              │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Components

Frontend (React + TypeScript + Vite)

Technology Stack:

React 18 with TypeScript
Vite for build tooling
TanStack Query for data fetching and caching
Tailwind CSS for styling
Axios for HTTP requests

Responsibilities:

Render todo list UI
Handle user interactions (create, update, delete, toggle)
Filter todos by status and priority
Display real-time updates
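
The status and priority filtering listed above can be sketched as a pure function. This is a minimal illustration, not the app's actual code: the `Todo` shape follows the Prisma model, while the `TodoFilter` type, the `filterTodos` name, and the `LOW`/`HIGH` enum values are assumptions made for the example.

```typescript
// Assumed enum values; the schema only shows MEDIUM as the default.
type Priority = "LOW" | "MEDIUM" | "HIGH";

// Client-side subset of the Todo model (hypothetical shape).
interface Todo {
  id: string;
  title: string;
  completed: boolean;
  priority: Priority;
}

// "all" disables filtering on that dimension.
interface TodoFilter {
  status: "all" | "completed" | "pending";
  priority: "all" | Priority;
}

// Returns the todos that match both the status and the priority filter.
function filterTodos(todos: readonly Todo[], filter: TodoFilter): Todo[] {
  return todos.filter((todo) => {
    const statusOk =
      filter.status === "all" ||
      (filter.status === "completed") === todo.completed;
    const priorityOk =
      filter.priority === "all" || filter.priority === todo.priority;
    return statusOk && priorityOk;
  });
}
```

Keeping the filter pure makes it easy to unit test and to reuse on data already cached by TanStack Query.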

Key Features:

Responsive design with dark theme
Optimistic UI updates
Toast notifications for user feedback
Priority color coding (High=red, Medium=yellow, Low=green)

Pages and Routes:

/dashboard — Stats cards, recent tasks, priority breakdown, quick links
/todos — Task list with CRUD via modal, search, status/priority filters
/projects — Project cards with status badges, create project modal
/projects/:id — Project details with tasks, team members, priority breakdown
/users — Team member list with search and role filter
/users/:id — User profile with assigned tasks, performance metrics, account info

Backend API (Node.js + Express + TypeScript)

Technology Stack:

Node.js 20 LTS
Express 4.18
TypeScript 5.3
Prisma ORM 5.7
Winston for logging
Application Insights SDK

Responsibilities:

RESTful API for todo operations
Health check endpoints
Request/response logging
Error handling and monitoring
Rate limiting

API Endpoints:

Todo Management:

GET /api/todos - List all todos
POST /api/todos - Create new todo
GET /api/todos/:id - Get single todo
PATCH /api/todos/:id - Update todo
DELETE /api/todos/:id - Delete todo
PATCH /api/todos/:id/toggle - Toggle completion status

Health Checks:

GET /api/health - Basic health check
GET /api/health/detailed - Detailed health with dependencies
GET /api/health/memory - Memory usage statistics
GET /api/health/cpu - CPU usage information
GET /api/health/ready - Kubernetes readiness probe
GET /api/health/live - Kubernetes liveness probe
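
One way the detailed health endpoint could fold dependency checks into a single status is sketched below. The response shape, the `summarizeHealth` name, and the policy (a database outage is unhealthy, any other outage is degraded) are assumptions for illustration; the source does not specify them.

```typescript
type DependencyStatus = "up" | "down";

// Hypothetical response body for GET /api/health/detailed.
interface DetailedHealth {
  status: "healthy" | "degraded" | "unhealthy";
  dependencies: Record<string, DependencyStatus>;
}

// Assumed policy: a critical dependency being down makes the service
// unhealthy; any other dependency being down makes it degraded.
function summarizeHealth(
  dependencies: Record<string, DependencyStatus>,
  critical: readonly string[] = ["database"],
): DetailedHealth {
  const down = Object.keys(dependencies).filter(
    (name) => dependencies[name] === "down",
  );
  let status: DetailedHealth["status"] = "healthy";
  if (down.some((name) => critical.indexOf(name) !== -1)) {
    status = "unhealthy";
  } else if (down.length > 0) {
    status = "degraded";
  }
  return { status, dependencies };
}
```

Under this policy a Redis outage still returns a usable response, because the API can fall back to PostgreSQL.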

Database (PostgreSQL Flexible Server)

Configuration:

PostgreSQL 16
SKU: B_Standard_B1ms (Burstable, 1 vCore, 2 GB RAM)
Storage: 32 GB
Backup retention: 7 days

Schema:

model Todo {
  id          String         @id @default(uuid())
  title       String
  description String?
  completed   Boolean        @default(false)
  priority    Priority       @default(MEDIUM)
  createdAt   DateTime       @default(now())
  updatedAt   DateTime       @updatedAt
  tags        Tag[]
  metadata    TodoMetadata?
}

model Tag {
  id     String @id @default(uuid())
  name   String @unique
  color  String
  todoId String
  todo   Todo   @relation(fields: [todoId], references: [id])
}

model TodoMetadata {
  id            String  @id @default(uuid())
  todoId        String  @unique
  category      String?
  estimatedTime Int?
  actualTime    Int?
  notes         String?
  todo          Todo    @relation(fields: [todoId], references: [id])
}

Intentional Issues:

Missing indexes on title and description columns (Scenario 3)
N+1 query patterns in API endpoints (Scenario 2)
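
To make the N+1 pattern concrete, the sketch below compares per-todo tag lookups with a single batched lookup. `CountingTagStore` is a hypothetical in-memory stand-in for the database that only counts how many queries it receives; it is not the Prisma client, and the function names are invented for the example.

```typescript
interface TagRow {
  id: string;
  name: string;
  todoId: string;
}

// Hypothetical in-memory stand-in for the database; counts queries issued.
class CountingTagStore {
  queries = 0;
  private readonly rows: readonly TagRow[];

  constructor(rows: readonly TagRow[]) {
    this.rows = rows;
  }

  // One query per todo: the shape that produces N extra queries.
  findByTodoId(todoId: string): TagRow[] {
    this.queries += 1;
    return this.rows.filter((row) => row.todoId === todoId);
  }

  // One query for many todos, comparable to WHERE "todoId" IN (...).
  findByTodoIds(todoIds: readonly string[]): TagRow[] {
    this.queries += 1;
    const wanted = new Set(todoIds);
    return this.rows.filter((row) => wanted.has(row.todoId));
  }
}

// N+1 shape: issues one tag query for every todo in the list.
function loadTagsNPlusOne(
  store: CountingTagStore,
  todoIds: readonly string[],
): Map<string, TagRow[]> {
  const result = new Map<string, TagRow[]>();
  for (const id of todoIds) {
    result.set(id, store.findByTodoId(id));
  }
  return result;
}

// Batched shape: issues one tag query, then groups rows in memory.
function loadTagsBatched(
  store: CountingTagStore,
  todoIds: readonly string[],
): Map<string, TagRow[]> {
  const result = new Map<string, TagRow[]>();
  for (const id of todoIds) {
    result.set(id, []);
  }
  for (const row of store.findByTodoIds(todoIds)) {
    result.get(row.todoId)?.push(row);
  }
  return result;
}
```

For a list of N todos, the first function issues N tag queries on top of the query that fetched the list, while the second issues one regardless of N.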

Cache (Azure Redis Cache)

Configuration:

SKU: Basic C0 (250 MB)
TLS 1.2 required
Authentication enabled

Cache Strategy:

Cache todo lists with 5-minute TTL
Invalidate cache on create/update/delete operations
Use Redis as session store
Cache frequently accessed todos

Cache Keys:

todos:all - All todos list
todos:completed - Completed todos
todos:pending - Pending todos
todo:{id} - Individual todo
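
The strategy and key layout above amount to a cache-aside pattern with a TTL and explicit invalidation on writes. The sketch below uses an in-memory `Map` in place of Redis so it runs on its own; `TtlCache`, `getOrLoad`, and `keysToInvalidate` are hypothetical names, and the real backend would issue the equivalent Redis commands instead.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

// 5-minute TTL, as stated in the cache strategy.
const TODO_LIST_TTL_MS = 5 * 60 * 1000;

// In-memory stand-in for Redis; the clock is injectable for testing.
class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  invalidate(...keys: string[]): void {
    for (const key of keys) {
      this.entries.delete(key);
    }
  }
}

// Keys a create/update/delete of one todo should drop: the item itself
// and every list that might contain it.
function keysToInvalidate(todoId: string): string[] {
  return [`todo:${todoId}`, "todos:all", "todos:completed", "todos:pending"];
}

// Cache-aside read: return the cached value or load and store it.
function getOrLoad<T>(cache: TtlCache<T>, key: string, load: () => T): T {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = load();
  cache.set(key, fresh, TODO_LIST_TTL_MS);
  return fresh;
}
```

A write handler would call `cache.invalidate(...keysToInvalidate(id))` after the database write succeeds. Skipping that call on one write path leaves stale entries in place until the TTL expires.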

Intentional Issues:

Cache invalidation bug in update endpoint (Scenario 8)
Connection pool exhaustion scenario (Scenario 4)

Monitoring (Application Insights + Log Analytics)

Telemetry Collection:

HTTP requests and responses
Custom events for business operations
Exception tracking with stack traces
Performance metrics (CPU, memory, response time)
Dependency tracking (database, Redis, external APIs)

Custom Metrics:

todos_created - Counter for new todos
todos_completed - Counter for completed todos
cache_hit_rate - Cache effectiveness
api_response_time - Response time distribution

Kusto Queries:

// High error rate detection
requests
| where timestamp > ago(5m)
| summarize 
    total = count(),
    errors = countif(toint(resultCode) >= 500)
| extend error_rate = (errors * 100.0) / total
| where error_rate > 5

// Slow queries
dependencies
| where type == "SQL"
| where duration > 2000
| summarize count() by operation_Name, bin(timestamp, 5m)

// CPU usage trend
performanceCounters
| where name == "% Processor Time"
| summarize avg(value) by bin(timestamp, 1m)

Alert Rules

CPU Alert:

Metric: CpuPercentage
Threshold: > 80%
Window: 5 minutes
Frequency: 1 minute
Severity: 2 (Warning)

Memory Alert:

Metric: MemoryPercentage
Threshold: > 85%
Window: 5 minutes
Frequency: 1 minute
Severity: 2 (Warning)

HTTP Error Alert:

Metric: Http5xx
Threshold: > 10 per minute
Window: 5 minutes
Frequency: 1 minute
Severity: 1 (Error)

Response Time Alert:

Metric: ResponseTime
Threshold: > 2 seconds
Window: 5 minutes
Frequency: 1 minute
Severity: 2 (Warning)
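
The rules above share one shape: a metric, a threshold, and an evaluation window. The sketch below models that shape and evaluates it over in-memory samples. It assumes average aggregation across the window, which the source does not state, and the `AlertRule`, `MetricSample`, and `shouldFire` names are hypothetical. Azure Monitor performs the real evaluation.

```typescript
interface MetricSample {
  timestampMs: number;
  value: number;
}

interface AlertRule {
  metric: string;
  threshold: number;
  windowMs: number;
  severity: 1 | 2;
}

// The CPU alert from the list above, expressed as data.
const cpuAlert: AlertRule = {
  metric: "CpuPercentage",
  threshold: 80,
  windowMs: 5 * 60 * 1000,
  severity: 2,
};

// Assumption: the rule fires when the average of the samples inside the
// window (nowMs - windowMs, nowMs] exceeds the threshold.
function shouldFire(
  rule: AlertRule,
  samples: readonly MetricSample[],
  nowMs: number,
): boolean {
  const inWindow = samples.filter(
    (s) => s.timestampMs > nowMs - rule.windowMs && s.timestampMs <= nowMs,
  );
  if (inWindow.length === 0) return false; // no data, no alert
  const average =
    inWindow.reduce((sum, s) => sum + s.value, 0) / inWindow.length;
  return average > rule.threshold;
}
```

With a 1-minute frequency, this check would run once per minute against the trailing 5 minutes of samples.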

Infrastructure as Code (Terraform)

Resources Managed:

Resource Group
PostgreSQL Flexible Server
Azure Redis Cache
App Service Plan
Linux Web App (Backend)
Static Web App (Frontend)
Application Insights
Log Analytics Workspace
Key Vault
Monitor Action Group
Metric Alerts
Diagnostic Settings

State Management:

Remote state stored in Azure Storage (optional)
State locking with blob lease
Sensitive outputs marked appropriately

Drift Detection:

Scheduled daily runs via GitHub Actions
Detects manual changes in Azure Portal
Creates GitHub issues for drift alerts
Supports the Scenario 10 drift demonstration
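
A scheduled drift check can distinguish "no changes" from "changes pending" by the exit code of `terraform plan -detailed-exitcode`: 0 means the plan succeeded with an empty diff, 1 means an error, and 2 means the plan succeeded with a non-empty diff. Whether this project's workflow uses that flag is an assumption, and the function names and issue title format below are hypothetical.

```typescript
type DriftResult = "in-sync" | "drift-detected" | "plan-failed";

// Maps the exit code of `terraform plan -detailed-exitcode` to a result.
function classifyPlanExitCode(exitCode: number): DriftResult {
  switch (exitCode) {
    case 0:
      return "in-sync"; // plan succeeded, empty diff
    case 2:
      return "drift-detected"; // plan succeeded, non-empty diff
    default:
      return "plan-failed"; // 1 or any unexpected code
  }
}

// Hypothetical title for the GitHub issue opened when drift is found;
// returns undefined when no issue should be created.
function driftIssueTitle(
  result: DriftResult,
  environment: string,
): string | undefined {
  return result === "drift-detected"
    ? `Infrastructure drift detected in ${environment}`
    : undefined;
}
```

Treating every code other than 0 and 2 as a failure keeps a broken plan from being reported as "no drift".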

CI/CD Pipeline

Frontend Pipeline

Checkout code
Install dependencies
Run linter (continue on error for demo)
Build with Vite
Deploy to Azure Static Web Apps

Backend Pipeline

Checkout code
Install dependencies
Generate Prisma Client
Run linter (continue on error for demo)
Build TypeScript
Build Docker image
Push to Azure Container Registry
Deploy to App Service
Run database migrations
Health check verification

Infrastructure Pipeline

Terraform format check
Terraform init
Terraform validate
Terraform plan (post to PR)
Manual approval for apply (production)
Terraform apply
Drift detection (scheduled)

E2E Testing Architecture

The project includes an end-to-end testing framework built on Playwright, with test generation driven by GitHub Copilot custom agents.

Testing Stack

┌─────────────────────────────────────────────────────────────┐
│              Copilot Custom Agents                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │   Explorer   │  │   Planner    │  │ Implementer  │  │
│  │  (MCP Nav)   │─▶│ (Test Plan)  │─▶│ (Code Gen)   │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                 Playwright Test Runner                      │
│                                                             │
│  ┌───────────────┐  ┌───────────────┐  ┌──────────────┐  │
│  │  Page Object  │  │  Test Specs   │  │   Fixtures   │  │
│  │    Models     │  │  (*.spec.ts)  │  │  (Mock Data) │  │
│  └───────────────┘  └───────────────┘  └──────────────┘  │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│          Frontend (Vite Dev Server :5173)                    │
│          API Mocks via page.route()                         │
└─────────────────────────────────────────────────────────────┘

Test Directory Structure

e2e/
  playwright.config.ts    # Test runner configuration
  tsconfig.json           # TypeScript configuration
  package.json            # Dependencies (@playwright/test)
  pages/                  # Page Object Models
    layout.page.ts        #   Navigation, header, mobile menu
    dashboard.page.ts     #   Stats cards, links, priority breakdown
    todos.page.ts         #   Task list, filters, search, create modal
    projects.page.ts      #   Project list, create modal
    users.page.ts         #   User list, search, role filter
  tests/                  # Test specifications
    navigation.spec.ts    #   Routing, menu, active states, redirects
    dashboard.spec.ts     #   Stats rendering, card links
    todos.spec.ts         #   CRUD, filtering, search, toggle
    projects.spec.ts      #   CRUD, detail view, members
    users.spec.ts         #   List, search, detail view
  fixtures/               # Shared resources
    mock-data.ts          #   Mock API responses for todos, projects, users

Key Design Decisions

API Mocking Strategy:

All E2E tests use page.route() to intercept API calls
Mock responses are defined in shared fixture files
Tests never depend on a running backend — only the Vite dev server
Different mock data for testing states: empty, populated, error
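
The mocking strategy above could look roughly like the sketch below. To stay self-contained it declares minimal structural stand-ins (`RoutablePage`, `FulfillableRoute`) for the parts of Playwright's `Page` and `Route` that it uses; a real spec would import those types from `@playwright/test` instead. The mock payloads, the `**/api/todos` pattern, and the helper names are assumptions, not the contents of the project's fixture file.

```typescript
type MockState = "empty" | "populated" | "error";

interface MockResponse {
  status: number;
  contentType: string;
  body: string;
}

// Structural stand-ins for the subset of Playwright's Route and Page
// used here (route.fulfill and page.route).
interface FulfillableRoute {
  fulfill(response: MockResponse): Promise<void>;
}
interface RoutablePage {
  route(
    url: string,
    handler: (route: FulfillableRoute) => Promise<void>,
  ): Promise<void>;
}

// Hypothetical fixture data for the "populated" state.
const populatedTodos = [
  { id: "1", title: "Write docs", completed: false, priority: "HIGH" },
];

// Builds the mocked response for GET /api/todos in a given state.
function mockTodosResponse(state: MockState): MockResponse {
  if (state === "error") {
    return {
      status: 500,
      contentType: "application/json",
      body: JSON.stringify({ error: "Internal Server Error" }),
    };
  }
  const todos = state === "empty" ? [] : populatedTodos;
  return {
    status: 200,
    contentType: "application/json",
    body: JSON.stringify(todos),
  };
}

// Registers the mock so the frontend never reaches a real backend.
async function mockTodosApi(
  page: RoutablePage,
  state: MockState,
): Promise<void> {
  await page.route("**/api/todos", (route) =>
    route.fulfill(mockTodosResponse(state)),
  );
}
```

A test would call `await mockTodosApi(page, "empty")` before navigating, then assert on the empty-state UI. Switching the state argument exercises the populated and error paths with the same spec structure.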

Selector Strategy (priority order):

getByRole() — buttons, links, headings, comboboxes
getByText() — visible text content
getByPlaceholder() — form input placeholders
getByLabel() — labeled form fields
locator('[data-testid="..."]') — fallback only

Multi-Viewport Testing:

Desktop Chrome (default viewport)
Mobile iPhone 13 (tests responsive layout, mobile menu)

Copilot Agent Pipeline

The test generation process is automated through 4 specialized agents:

Explorer (playwright-explorer.agent.md)
- Uses Playwright MCP to navigate the live application
- Takes snapshots of each page
- Documents interactive elements with accessible selectors
- Outputs .testagent/exploration.md
Planner (playwright-planner.agent.md)
- Reads exploration findings and source code
- Organizes tests into incremental phases
- Defines Page Objects, mock data, and test cases
- Outputs .testagent/e2e-plan.md
Implementer (playwright-implementer.agent.md)
- Implements one phase at a time
- Creates Page Object Models in e2e/pages/
- Writes test specs with API mocks in e2e/tests/
- Runs tests and fixes failures (up to 3 retries)
Tester (Orchestrator) (playwright-tester.agent.md)
- Coordinates the full pipeline: Explorer → Planner → Implementer
- Ensures dev server is running
- Validates all tests pass before reporting

Running Tests

# From the frontend directory
npm run test:e2e          # Run all tests (headless)
npm run test:e2e:ui       # Interactive UI mode
npm run test:e2e:headed   # Visible browser window

# From the e2e directory
cd e2e
npx playwright test                        # All tests
npx playwright test tests/todos.spec.ts    # Specific spec
npx playwright test --project=mobile       # Mobile viewport only
npx playwright test --list                 # List all tests
npx playwright show-report                 # View HTML report

Security Considerations

Current Configuration (Demo Mode)

⚠️ For demonstration purposes only - NOT production-ready:

PostgreSQL allows all IP addresses (0.0.0.0/0)
CORS allows all origins (*)
Error handler exposes sensitive data in dev mode
Purging of soft-deleted items is enabled on Key Vault
No network isolation or private endpoints

Production Recommendations

Network Security:

Use Virtual Network integration
Deploy with private endpoints
Implement Network Security Groups (NSGs)
Enable Azure Firewall

Authentication & Authorization:

Implement Microsoft Entra ID (formerly Azure AD) authentication
Use Managed Identity for service-to-service auth
Rotate secrets regularly
Use Azure RBAC for access control

Data Protection:

Enable encryption at rest and in transit
Implement data retention policies
Use Azure Backup for disaster recovery
Enable geo-replication for critical data

Monitoring & Compliance:

Enable Microsoft Defender for Cloud (formerly Azure Security Center)
Implement Azure Policy for governance
Use Microsoft Sentinel (formerly Azure Sentinel) for SIEM
Regular security audits

Scalability Considerations

Current Limitations (Demo Tier)

Single instance App Service (no auto-scale)
Basic Redis Cache (250 MB, no clustering)
Burstable PostgreSQL (1 vCore)

Production Scaling Strategy

Horizontal Scaling:

Auto-scale App Service based on CPU/Memory
Use Azure Front Door for global distribution
Implement read replicas for PostgreSQL
Use Redis cluster mode for cache

Vertical Scaling:

Upgrade App Service to Premium tier
Increase PostgreSQL to General Purpose tier
Upgrade Redis to Premium tier with persistence

Performance Optimization:

Implement CDN for static assets
Use connection pooling
Implement query result caching
Optimize database indexes
Use compression for API responses

Disaster Recovery

Backup Strategy:

PostgreSQL automated backups (7-day retention)
Point-in-time restore capability
Infrastructure as Code for environment recreation
Application Insights data retention (30 days)

Recovery Procedures:

Restore database from backup
Deploy infrastructure from Terraform
Deploy application from latest Docker image
Verify health checks
Update DNS if needed

RPO/RTO Targets:

Recovery Point Objective (RPO): 24 hours
Recovery Time Objective (RTO): 4 hours

Cost Optimization

Current Monthly Estimate (Dev Environment):

App Service Basic (B1): ~$13
PostgreSQL Flexible Server (B1ms): ~$12
Redis Cache (Basic C0): ~$17
Static Web App (Free): $0
Application Insights: ~$5
Log Analytics: ~$2
Total: ~$49/month

Cost Saving Recommendations:

Use Azure Dev/Test pricing
Shut down non-production environments after hours
Use Azure Reservations for production
Implement resource tagging for cost tracking
Review Log Analytics retention policies

Monitoring Dashboard

Key Metrics to Display:

Application Health
- Request rate (requests/min)
- Error rate (%)
- Average response time (ms)
- Active users
Infrastructure Health
- CPU usage (%)
- Memory usage (%)
- Database connections
- Redis cache hit rate
Business Metrics
- Todos created (count)
- Todos completed (count)
- Alert frequency
