ResilienceAtlas EC2 Deployment with AWS CodeDeploy

This directory contains scripts and configuration for deploying ResilienceAtlas to AWS EC2 instances using Docker Compose and AWS CodeDeploy.

Overview

The deployment architecture uses:

  • Single EC2 Instance for hosting both staging and production (cost-effective)
  • AWS CodeDeploy for automated deployments from GitHub
  • Amazon ECR for pre-built Docker images (fast deployments)
  • Application Load Balancer for routing traffic based on domain names
  • Docker Compose for container orchestration
  • GitHub Actions for CI/CD pipeline with optimized builds
  • S3 for storing deployment packages
  • Route53 for DNS management

Single-Instance Architecture

Both staging and production run on the same EC2 instance with isolated:

  • Directories: /opt/resilienceatlas-staging, /opt/resilienceatlas-production
  • Ports: Staging (3000/3001/5433), Production (4000/4001)
  • Docker Networks: Separate networks per environment
  • Container Names: Prefixed with environment name

GitHub Repository
    │
    ├─ Push to 'staging' branch
    │       │
    │       ▼
    │   GitHub Actions → S3 → CodeDeploy (staging group)
    │       │
    │       ▼
    │   EC2 Instance: /opt/resilienceatlas-staging
    │       ├─ Frontend Container (port 3000)
    │       ├─ Backend Container (port 3001)
    │       └─ Database Container (container port 5432, host port 5433)
    │
    └─ Push to 'main' branch
            │
            ▼
        GitHub Actions → S3 → CodeDeploy (production group)
            │
            ▼
        EC2 Instance: /opt/resilienceatlas-production
            ├─ Frontend Container (port 4000)
            └─ Backend Container (port 4001)
            (Production uses external PostgreSQL)
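Because container names carry the environment prefix, each stack can be inspected independently from the same instance, for example:

```shell
# Show only staging containers (names are prefixed with the environment)
docker ps --filter "name=staging"

# Show only production containers
docker ps --filter "name=production"
```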

Quick Setup

Prerequisites

  1. AWS CLI configured with appropriate credentials
  2. Python 3.8+ with pip
  3. EC2 instance already running (single instance for both environments)
  4. External PostgreSQL for production database

AWS Profile Support

All scripts support the --profile (or -p) option to use a named profile from ~/.aws/credentials:

# Use a specific AWS profile
python3 setup_codedeploy.py --profile resilienceatlas

# Or use the short form
python3 setup_s3_bucket.py -p resilienceatlas

Example ~/.aws/credentials file:

[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

[resilienceatlas]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

If no profile is specified, the default credentials are used.
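As an alternative to passing --profile to every script, the standard AWS_PROFILE environment variable is honored by both the AWS CLI and boto3 (which the setup scripts use):

```shell
# Set the profile once for the shell session instead of per command
export AWS_PROFILE=resilienceatlas
python3 setup_codedeploy.py
python3 setup_s3_bucket.py
```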

Complete Infrastructure Setup

  1. Install dependencies:

    cd scripts
    pip install -r requirements.txt
  2. Run the setup scripts in order:

    # Step 1: Create GitHub OIDC provider and IAM role (RECOMMENDED)
    python3 setup_github_oidc.py
    
    # Step 2: Create EC2 instance role for CodeDeploy and ECR access
    python3 setup_ec2_instance_role.py
    
    # Step 3: Create S3 bucket for deployments
    python3 setup_s3_bucket.py
    
    # Step 4: Setup ECR permissions for GitHub Actions and EC2
    python3 setup_ecr_permissions.py
    
    # Step 5: Create ECR repositories for Docker images
    python3 setup_ecr_repositories.py --region us-east-1
    
    # Step 6: Create CodeDeploy application and deployment groups
    python3 setup_codedeploy.py
    
    # Step 7: Set up Application Load Balancer (optional, if not using existing)
    python3 setup_alb.py <vpc-id>
  3. On the EC2 instance, install the CodeDeploy agent:

    sudo bash scripts/install-codedeploy-agent.sh
  4. Tag your EC2 instance for deployments:

    • Required tag: CodeDeploy=ResilienceAtlas
    • This tag identifies which instance(s) receive CodeDeploy deployments
    • Other instances can have Project=ResilienceAtlas without receiving deployments
  5. Configure GitHub Actions secrets (see below)
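The tag required in step 4 can be applied with the AWS CLI; the instance ID below is a placeholder:

```shell
# Apply the required CodeDeploy tag (replace i-0123456789abcdef0 with your instance ID)
aws ec2 create-tags \
  --resources i-0123456789abcdef0 \
  --tags Key=CodeDeploy,Value=ResilienceAtlas \
  --profile resilienceatlas
```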

Scripts Reference

Infrastructure Setup Scripts

| Script | Description |
| --- | --- |
| setup_github_oidc.py | Creates GitHub OIDC provider and IAM role for secure authentication |
| setup_ec2_instance_role.py | Creates IAM role for EC2 instances with CodeDeploy and ECR access |
| setup_s3_bucket.py | Creates S3 bucket for deployment packages |
| setup_ecr_permissions.py | Adds ECR permissions to the GitHub Actions OIDC and EC2 instance roles |
| setup_ecr_repositories.py | Creates ECR repositories with lifecycle policies for Docker images |
| setup_codedeploy.py | Creates CodeDeploy application and deployment groups |
| setup_alb.py | Creates Application Load Balancer and target groups |
| install-codedeploy-agent.sh | Installs the CodeDeploy agent on EC2 instances |

CodeDeploy Lifecycle Hook Scripts

Located in scripts/codedeploy/:

| Script | Phase | Description |
| --- | --- | --- |
| common.sh | - | Shared functions used by all hooks |
| application-stop.sh | ApplicationStop | Stops existing containers gracefully |
| before-install.sh | BeforeInstall | Prepares the environment, creates directories |
| after-install.sh | AfterInstall | Pulls pre-built Docker images from ECR, syncs database (staging) |
| application-start.sh | ApplicationStart | Starts containers, runs migrations |
| validate-service.sh | ValidateService | Performs health checks |
| sync-database.sh | - | Copies the production database to staging |
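As a minimal sketch (not the actual script), an ApplicationStop hook in this layout typically derives the environment from CodeDeploy's DEPLOYMENT_GROUP_NAME variable and shuts down the matching stack; the per-environment compose filename here is an assumption based on the staging file named elsewhere in this document:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an ApplicationStop hook; see scripts/codedeploy/ for the real one.
set -euo pipefail

# CodeDeploy exposes DEPLOYMENT_GROUP_NAME to lifecycle hooks,
# e.g. "resilienceatlas-staging" -> "staging"
ENVIRONMENT="${DEPLOYMENT_GROUP_NAME#resilienceatlas-}"
APP_DIR="/opt/resilienceatlas-${ENVIRONMENT}"
COMPOSE_FILE="docker-compose.swarm.${ENVIRONMENT}.yml"   # assumed naming pattern

if [ -d "$APP_DIR" ] && [ -f "$APP_DIR/$COMPOSE_FILE" ]; then
  cd "$APP_DIR"
  # Stop containers gracefully; don't fail the deployment if nothing is running yet
  docker compose -f "$COMPOSE_FILE" down --timeout 30 || true
fi
```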

Data Setup Scripts

| Script | Description |
| --- | --- |
| setup_ldn_data.sh | Downloads LDN datasets and geoBoundaries from S3, imports boundaries, dissolves geometries, and seeds the LDN scope. Run with --profile <aws-profile>. |

Management Scripts

| Script | Description |
| --- | --- |
| manage_ec2_instances.py | Instance management: status, start, stop, and maintenance mode |

GitHub Actions Workflows

Staging Deployment (codedeploy_staging.yml)

  • Triggered on pushes to staging branch
  • Optionally syncs production database to staging
  • Deploys via CodeDeploy to staging instance
  • Accessible at https://staging.resilienceatlas.org

Production Deployment (codedeploy_production.yml)

  • Triggered on pushes to main branch
  • Deploys via CodeDeploy to production instance
  • Uses external PostgreSQL database
  • Accessible at https://resilienceatlas.org

GitHub Actions Secrets Configuration

Required Secrets (Using OIDC - Recommended)

All environment configuration is managed through GitHub Secrets. During deployment, the GitHub Actions workflow generates .env.staging or .env.production files from these secrets, which are then included in the deployment package. No manual server configuration is required.

# AWS OIDC Authentication (no access keys needed!)
AWS_OIDC_ROLE_ARN          # IAM role ARN from setup_github_oidc.py
DEPLOYMENT_S3_BUCKET       # S3 bucket name from setup_s3_bucket.py

# ECR Configuration (automatically set by workflows)
ECR_REGISTRY               # Set by workflows, no manual config needed
BACKEND_IMAGE              # Set by workflows, no manual config needed
FRONTEND_IMAGE             # Set by workflows, no manual config needed

# Staging Environment
STAGING_DB_PASSWORD        # Postgres password for staging container
STAGING_SECRET_KEY_BASE    # Rails secret key (128 chars) - generate with: rails secret
STAGING_DEVISE_KEY         # Devise authentication key - generate with: rails secret

# Production Environment  
PRODUCTION_DATABASE_URL    # postgresql://user:pass@host:port/db
PRODUCTION_SECRET_KEY_BASE # Rails secret key (128 chars) - generate with: rails secret
PRODUCTION_DEVISE_KEY      # Devise authentication key - generate with: rails secret

# Application Configuration
NEXT_PUBLIC_GOOGLE_ANALYTICS  # Google Analytics ID
NEXT_PUBLIC_TRANSIFEX_TOKEN   # Transifex API token
NEXT_PUBLIC_TRANSIFEX_SECRET  # Transifex secret
NEXT_PUBLIC_GOOGLE_API_KEY    # Google Maps API key
RESILIENCE_API_KEY            # Resilience API key
SPARKPOST_API_KEY             # SparkPost email service key

# Error Tracking (optional)
ROLLBAR_ACCESS_TOKEN             # Rollbar server-side token
NEXT_PUBLIC_ROLLBAR_CLIENT_TOKEN # Rollbar client-side token
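The Rails keys can be generated and uploaded in one step with the GitHub CLI, assuming gh is authenticated against the repository (the backend directory name is illustrative):

```shell
# Generate a fresh Rails secret and store it directly as a GitHub Actions secret
cd backend   # hypothetical path to the Rails app
gh secret set STAGING_SECRET_KEY_BASE --body "$(bundle exec rails secret)"
gh secret set STAGING_DEVISE_KEY      --body "$(bundle exec rails secret)"
```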

How Environment Variables Work

  1. GitHub Secrets → Stored securely in GitHub repository settings
  2. GitHub Actions → Generates .env.{environment} file with secrets during deployment
  3. Deployment Package → Env file is included in the zip archive
  4. CodeDeploy → Extracts package to server, env file is automatically available
  5. Docker Compose → Loads environment variables when starting containers
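Step 2 can be sketched as a shell step; the variable names follow the secrets listed earlier, but the real workflow writes more variables and the in-file key names (e.g. DEVISE_SECRET_KEY) are assumptions:

```shell
# Sketch of the workflow's .env generation step.
# Placeholder defaults stand in for the GitHub Secrets here.
ENVIRONMENT=staging
STAGING_DB_PASSWORD="${STAGING_DB_PASSWORD:-placeholder}"
STAGING_SECRET_KEY_BASE="${STAGING_SECRET_KEY_BASE:-placeholder}"
STAGING_DEVISE_KEY="${STAGING_DEVISE_KEY:-placeholder}"

cat > ".env.${ENVIRONMENT}" <<EOF
POSTGRES_PASSWORD=${STAGING_DB_PASSWORD}
SECRET_KEY_BASE=${STAGING_SECRET_KEY_BASE}
DEVISE_SECRET_KEY=${STAGING_DEVISE_KEY}
EOF

# Restrict permissions before the file goes into the deployment package
chmod 600 ".env.${ENVIRONMENT}"
```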

This approach ensures:

  • ✅ No secrets stored on the server
  • ✅ No manual server configuration needed
  • ✅ Secrets are rotated by updating GitHub Secrets and redeploying
  • ✅ Full audit trail in GitHub Actions logs (secrets are masked)

Deployment Flow

Optimized Deployment with ECR Pre-built Images

The deployment process has been optimized to use pre-built Docker images stored in Amazon ECR:

  1. GitHub Actions CI/CD Pipeline:

    • Builds Docker images for backend and frontend
    • Pushes images to Amazon ECR with tags (commit SHA + environment)
    • Creates deployment package with configuration
  2. Fast Deployment to EC2:

    • EC2 instance pulls pre-built images from ECR (1-2 minutes)
    • No local Docker builds required (saves 15-25 minutes)
    • Images are cached for even faster subsequent pulls
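On the instance, the pull step amounts to an ECR login followed by docker pull; the registry and image variables match those set by the workflows, and the tag shown is illustrative:

```shell
# Authenticate the Docker client against ECR (the EC2 instance role supplies credentials)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin "$ECR_REGISTRY"

# Pull the images built by GitHub Actions for this environment
docker pull "$ECR_REGISTRY/$BACKEND_IMAGE:staging"
docker pull "$ECR_REGISTRY/$FRONTEND_IMAGE:staging"
```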

Staging Deployment

  1. Push to staging triggers GitHub Actions workflow
  2. Build and push Docker images to ECR (cached layers speed up rebuilds)
  3. Create deployment package (configuration and scripts only)
  4. Upload to S3 in staging folder
  5. Create CodeDeploy deployment
  6. CodeDeploy agent on staging instance:
    • Stops existing frontend/backend containers
    • Downloads deployment package from S3
    • Pulls pre-built images from ECR (fast!)
    • Syncs production database (if enabled)
    • Starts new containers
    • Runs database migrations
    • Performs health checks

Production Deployment

  1. Push to main triggers GitHub Actions workflow
  2. Build and push Docker images to ECR (cached layers speed up rebuilds)
  3. Create deployment package (configuration and scripts only)
  4. Upload to S3 in production folder
  5. Create CodeDeploy deployment
  6. CodeDeploy agent on production instance:
    • Stops existing containers
    • Downloads deployment package from S3
    • Pulls pre-built images from ECR (fast!)
    • Starts new containers
    • Runs database migrations
    • Performs health checks

Database Management

Staging Database

  • Runs in a Docker container (postgis/postgis:15-3.3)
  • Can be synced from production during deployment
  • Set SYNC_PRODUCTION_DB=true to enable sync (default)
  • Data persists in Docker volume between deployments

Production Database

  • Uses external PostgreSQL server
  • Configure via PRODUCTION_DATABASE_URL
  • Migrations run automatically during deployment

Manual Database Sync

To manually sync the staging database from production:

ssh ubuntu@<staging-ip>
cd /opt/resilienceatlas-staging
PRODUCTION_DATABASE_URL="postgresql://..." ./scripts/codedeploy/sync-database.sh
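A production-to-staging copy of this kind is typically a pg_dump streamed into the staging container; the sketch below illustrates the idea (the database name is a guess, and the real sync-database.sh may differ):

```shell
# Hypothetical sketch: stream the production DB into the staging container's Postgres.
# This destroys existing staging data.
pg_dump --clean --no-owner "$PRODUCTION_DATABASE_URL" \
  | docker compose -f docker-compose.swarm.staging.yml exec -T database \
      psql -U postgres -d resilienceatlas
```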

Rollback Procedures

Automatic Rollback

CodeDeploy automatically rolls back if:

  • Deployment fails
  • Health checks fail
  • Any lifecycle hook script exits with non-zero code
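Automatic rollback is a deployment-group setting; it can be enabled (or re-confirmed) with the CLI:

```shell
# Enable automatic rollback on failed deployments for the staging group
aws deploy update-deployment-group \
  --application-name resilienceatlas \
  --current-deployment-group-name resilienceatlas-staging \
  --auto-rollback-configuration enabled=true,events=DEPLOYMENT_FAILURE
```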

Manual Rollback via CodeDeploy Console

  1. Go to AWS CodeDeploy console
  2. Select the deployment group
  3. Choose a previous successful deployment
  4. Click "Redeploy"

Manual Rollback via CLI

# Get deployment history
aws deploy list-deployments \
  --application-name resilienceatlas \
  --deployment-group-name resilienceatlas-staging

# Redeploy a previous revision
aws deploy create-deployment \
  --application-name resilienceatlas \
  --deployment-group-name resilienceatlas-staging \
  --revision revisionType=S3,s3Location={bucket=...,key=...,bundleType=zip}

Manual Rollback on Instance

ssh ubuntu@<instance-ip>
cd /opt/resilienceatlas-staging

# Check backup commits
ls -la /opt/resilienceatlas-backups/

# Stop current containers
docker compose -f docker-compose.swarm.staging.yml down

# Checkout previous version (from backup .sha file)
git checkout <previous-commit>

# Rebuild and restart
docker compose -f docker-compose.swarm.staging.yml up -d --build

Monitoring and Troubleshooting

Check Deployment Status

# View deployment in AWS Console
aws deploy get-deployment --deployment-id <deployment-id>

# List recent deployments
aws deploy list-deployments \
  --application-name resilienceatlas \
  --deployment-group-name resilienceatlas-staging

View CodeDeploy Agent Logs

ssh ubuntu@<instance-ip>
sudo tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log

View Deployment Script Logs

ssh ubuntu@<instance-ip>
sudo tail -f /opt/codedeploy-agent/deployment-root/*/logs/scripts.log

View Application Logs

ssh ubuntu@<instance-ip>
cd /opt/resilienceatlas-staging

# All container logs
docker compose -f docker-compose.swarm.staging.yml logs

# Specific container
docker compose -f docker-compose.swarm.staging.yml logs frontend
docker compose -f docker-compose.swarm.staging.yml logs backend
docker compose -f docker-compose.swarm.staging.yml logs database

Health Check Commands

# Frontend
curl -f http://localhost:3000

# Backend
curl -f http://localhost:3001/health

# Database (staging only)
docker compose -f docker-compose.swarm.staging.yml exec database pg_isready -U postgres