JudgeFinder Emergency Rollback Procedure

Domain: https://judgefinder.io
Platform: Netlify
Rollback Time: < 5 minutes


When to Rollback

Execute an immediate rollback if any of the following occurs:

  1. Site is completely down (500 errors)
  2. Judges directory fails to load (critical functionality)
  3. Database connection errors prevent core features
  4. Authentication is completely broken
  5. Data corruption or incorrect data display
  6. Security vulnerability exposed
  7. Performance degradation > 50% (load time > 10 seconds)
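
The two measurable criteria in this list (5xx responses and a load time over 10 seconds) can be checked from a terminal. The sketch below is a minimal illustration, not part of any existing tooling: the function name `should_rollback` is hypothetical, and the remaining criteria (data corruption, security exposure, broken authentication) still need human judgment.

```shell
# Hypothetical helper: decide whether the measurable rollback criteria are met.
# Usage: should_rollback <http_status> <load_seconds>
# Exit status 0 means "roll back", 1 means "no automatic trigger".
should_rollback() (
  status="$1"
  seconds="$2"

  # Criterion 1: any 5xx response, or 000 (curl could not connect at all).
  case "$status" in
    5??|000) exit 0 ;;
  esac

  # Criterion 7: load time above 10 seconds (awk handles the decimal compare).
  if awk -v t="$seconds" 'BEGIN { exit !(t > 10) }'; then
    exit 0
  fi
  exit 1
)
```

The inputs can come from `curl -o /dev/null -s -w '%{http_code} %{time_total}' https://judgefinder.io`, which prints the status code and total request time separated by a space.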

Rollback Methods

Method 1: Netlify Dashboard (Fastest - 2 minutes)

Steps:

  1. Open Netlify Dashboard:

    https://app.netlify.com/sites/judgefinder/deploys
    
  2. Find Previous Working Deploy:

    • Look for the deploy BEFORE the current one
    • Verify it has "Published" status
    • Note the deploy ID and timestamp
  3. Click Rollback:

    • Click on the previous deploy
    • Click "Publish deploy" button
    • Confirm in modal dialog
  4. Wait for Activation:

    • Takes 30-60 seconds
    • Watch for "Published" badge
  5. Verify:

    curl -I https://judgefinder.io
    # Should return HTTP 200

Visual Guide:

Netlify Dashboard > Site: judgefinder
  ↓
Deploys tab
  ↓
Find previous "Published" deploy
  ↓
Click deploy > "Publish deploy"
  ↓
Confirm > Wait 30-60 seconds
  ↓
Site restored

Method 2: Netlify CLI (Command Line - 3 minutes)

Prerequisites:

# Install Netlify CLI (if not installed)
npm install -g netlify-cli

# Login to Netlify
netlify login

Rollback Commands:

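# Link to site (use --id with the site ID if the name lookup fails)
netlify link --name judgefinder

# List recent deploys, one compact JSON object per line
netlify api listSiteDeploys --data "{\"site_id\": \"$NETLIFY_SITE_ID\"}" \
  | jq -c '.[] | {id: .id, state: .state, created_at: .created_at}' | head -20

# Rollback to previous deploy (automatic)
netlify api rollbackSiteDeploy --data "{\"site_id\": \"$NETLIFY_SITE_ID\"}"

# OR rollback to specific deploy ID
netlify api restoreSiteDeploy --data "{\"site_id\": \"$NETLIFY_SITE_ID\", \"deploy_id\": \"[DEPLOY_ID]\"}"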
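
Picking a deploy ID by eye is error-prone under pressure. As a sketch, the filter below reads the JSON array that `listSiteDeploys` prints and returns the second-newest production deploy in the `ready` state. It assumes the array is ordered newest first and that the newest ready production deploy is the one currently live; neither assumption is confirmed by this document, so check the ID against the dashboard before restoring. The function name is hypothetical.

```shell
# Hypothetical helper: print the ID of the previous good production deploy.
# Reads the JSON array from `netlify api listSiteDeploys` on stdin.
# Assumes newest-first ordering and that the newest "ready" production
# deploy is the one currently published.
previous_ready_deploy() {
  jq -r '[ .[] | select(.state == "ready" and .context == "production") ]
         | .[1].id // empty'
}
```

Piping the deploy list into `previous_ready_deploy` prints a single ID, or nothing when fewer than two matching deploys exist.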

Example:

# Rollback to previous
# NETLIFY_SITE_ID is the site ID shown in the Netlify site settings
netlify api rollbackSiteDeploy --data "{\"site_id\": \"$NETLIFY_SITE_ID\"}"

# Wait for completion
sleep 30

# Verify
curl -I https://judgefinder.io
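
A fixed `sleep 30` can return before the rollback is live, or wait longer than needed. A possible alternative is to poll until the site answers with 200. In this sketch `wait_for_200`, `site_status`, and their arguments are hypothetical; the status-fetching command is passed in so the loop can be exercised without touching production.

```shell
# Hypothetical helper: poll until a command prints "200" or attempts run out.
# Usage: wait_for_200 <max_attempts> <delay_seconds> <command...>
wait_for_200() (
  attempts="$1"
  delay="$2"
  shift 2

  i=1
  while [ "$i" -le "$attempts" ]; do
    code="$("$@" 2>/dev/null)"
    if [ "$code" = "200" ]; then
      echo "OK after $i attempt(s)"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "Still not returning 200 after $attempts attempt(s)" >&2
  exit 1
)

# Prints only the HTTP status code for the production site.
site_status() {
  curl -s -o /dev/null -w '%{http_code}' https://judgefinder.io
}
```

Calling `wait_for_200 12 5 site_status` polls every 5 seconds for roughly a minute, which covers the 30-60 second activation window.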

Method 3: GitHub Actions Workflow (4 minutes)

Steps:

  1. Go to GitHub Actions:

    https://github.com/[owner]/[repo]/actions/workflows/rollback.yml
    
  2. Trigger Rollback Workflow:

    • Click "Run workflow" button
    • Select branch: main
    • Fill in inputs:
      • deployment_id: Leave empty for previous deploy
      • reason: "Critical issue: [describe problem]"
    • Click green "Run workflow" button
  3. Monitor Workflow:

    • Watch workflow execution in Actions tab
    • Wait for all jobs to complete (2-3 minutes)
  4. Verify:

    curl -I https://judgefinder.io
    # Should return HTTP 200
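
The same workflow can also be started from a terminal with the GitHub CLI, which helps when the web UI is slow. This is a sketch under assumptions: it presumes `gh` is installed and authenticated, that it runs inside a clone of the repository, and that `rollback.yml` declares the `deployment_id` and `reason` inputs listed in step 2. The wrapper name `trigger_rollback` and the `DRY_RUN` variable are hypothetical.

```shell
# Hypothetical wrapper around `gh workflow run` for the rollback workflow.
# Usage: trigger_rollback "<reason>" [deployment_id]
# Set DRY_RUN=1 to print the command without contacting GitHub.
trigger_rollback() (
  reason="$1"
  deploy_id="${2:-}"

  if [ -z "$reason" ]; then
    echo "A reason is required" >&2
    exit 2
  fi

  set -- gh workflow run rollback.yml --ref main -f "reason=$reason"
  # Leave deployment_id out entirely to roll back to the previous deploy.
  if [ -n "$deploy_id" ]; then
    set -- "$@" -f "deployment_id=$deploy_id"
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
)
```

After triggering, `gh run watch` can follow the run from the same terminal instead of the Actions tab.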


Rollback Verification Checklist

After rollback, verify these critical functions:

# 1. Site is live
curl -I https://judgefinder.io
# Expected: HTTP/2 200

# 2. Judges API works
curl "https://judgefinder.io/api/judges/list?page=1&limit=5"
# Expected: JSON array with judges

# 3. Health check
curl https://judgefinder.io/api/health
# Expected: {"status":"ok"}
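
The three checks above can be wrapped so that each prints PASS or FAIL instead of requiring someone to read raw output. A minimal sketch follows; all function names are hypothetical, and the expected health body is taken from the comment above.

```shell
# Hypothetical helpers for the verification checklist.

# Extract the status code from `curl -I` output on stdin
# (first line looks like "HTTP/2 200" or "HTTP/1.1 200 OK").
http_status() {
  awk 'NR == 1 { print $2 }' | tr -d '\r'
}

# Succeeds when the health body on stdin reports status "ok",
# ignoring whitespace differences.
health_ok() {
  tr -d ' \t\r\n' | grep -q '"status":"ok"'
}

# Print PASS or FAIL for a label, given the exit status of the check.
check() {
  if [ "$2" -eq 0 ]; then
    echo "PASS  $1"
  else
    echo "FAIL  $1"
    return 1
  fi
}

# Run all three checks against production; exits non-zero if any fails.
verify_rollback() (
  base="https://judgefinder.io"
  failed=0

  [ "$(curl -sI "$base" | http_status)" = "200" ]
  check "site is live" $? || failed=1

  # -f makes curl fail on HTTP errors; the JSON body itself is not inspected.
  curl -sf "$base/api/judges/list?page=1&limit=5" >/dev/null
  check "judges API" $? || failed=1

  curl -s "$base/api/health" | health_ok
  check "health check" $? || failed=1

  exit "$failed"
)
```

Running `verify_rollback` prints one line per check, and its exit status can gate the next step of a script.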

Manual Browser Checks:

  • Home page loads
  • /judges page displays 4,732 judges
  • Search functionality works
  • Authentication pages load
  • No console errors

Post-Rollback Actions

Immediate (Within 5 minutes)

  1. Verify Site Stability:

    • Test critical pages
    • Check function logs in Netlify
    • Monitor error rates
  2. Document the Rollback:

    # Rollback Report
    
    **Date:** [timestamp]
    **Reason:** [why rollback was needed]
    **Method:** [Dashboard/CLI/GitHub Actions]
    **Rolled Back From:** [failed deploy ID]
    **Rolled Back To:** [previous working deploy ID]
    **Duration:** [time to complete rollback]
    **Verified By:** [your name]
    
    ## Issues in Failed Deploy
    
    [Describe what went wrong]
    
    ## Verification After Rollback
    
    - Site status: [PASS/FAIL]
    - Critical functions: [PASS/FAIL]
    - User impact: [description]
  3. Notify Team:

    • Email: ryan@judgefinder.io
    • Subject: "ROLLBACK EXECUTED - judgefinder.io"
    • Include rollback report
    • Attach screenshots if available
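
Filling in the rollback report by hand under pressure tends to leave fields blank. One option is to generate the header of the report from the values noted during the rollback. The sketch below is illustrative: `rollback_report` and its argument order are hypothetical, and it covers only the header fields of the report template in step 2.

```shell
# Hypothetical helper: print the header of a rollback report.
# Usage: rollback_report <reason> <method> <from_deploy> <to_deploy> <duration> <name>
rollback_report() {
  cat <<EOF
# Rollback Report

**Date:** $(date -u +%Y-%m-%dT%H:%M:%SZ)
**Reason:** $1
**Method:** $2
**Rolled Back From:** $3
**Rolled Back To:** $4
**Duration:** $5
**Verified By:** $6
EOF
}
```

Redirecting the output to a file gives a starting point that can be completed with the "Issues in Failed Deploy" and "Verification After Rollback" sections before it is sent.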

Short-term (Within 1 hour)

  1. Analyze Failed Deployment:

    • Review Netlify deploy logs
    • Check function error logs
    • Examine database migration status
    • Review commit changes that caused issue
  2. Create GitHub Issue:

    # Deployment Failed - Rollback Executed
    
    **Deploy ID:** [failed deploy ID]
    **Commit SHA:** [git commit hash]
    **Rollback Time:** [timestamp]
    
    ## Symptoms
    
    [What failed during deployment]
    
    ## Cause
    
    [Root cause analysis]
    
    ## Fix Required
    
    [What needs to be fixed before redeploying]
    
    ## Verification Steps
    
    [How to test the fix locally]
  3. Fix in Development:

    • Create new branch: fix/deployment-issue-[date]
    • Fix the issue locally
    • Test thoroughly before redeploying

Before Next Deployment

  1. Enhanced Testing:

    • Run full test suite locally
    • Build succeeds: npm run build
    • TypeScript passes: npm run type-check
    • Linting passes: npm run lint
    • E2E tests pass: npm run test:e2e
  2. Staging Deployment (Optional):

    • Deploy to Netlify branch deploy first
    • Test on preview URL before production
    • Get approval from team
  3. Update Deployment Checklist:

    • Add checks to prevent same issue
    • Update PRE_DEPLOYMENT_CHECKLIST.md
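
The commands in step 1 can be chained so that the first failure stops the run. A minimal sketch, assuming the `build`, `type-check`, `lint`, and `test:e2e` scripts exist in `package.json` as the list implies; `predeploy_checks` is a hypothetical name, and the `RUNNER` variable exists only so the sequence can be dry-run.

```shell
# Hypothetical helper: run the pre-deployment checks in order, stopping at
# the first failure. RUNNER defaults to "npm run"; override it to dry-run.
predeploy_checks() (
  runner="${RUNNER:-npm run}"
  for script in type-check lint build test:e2e; do
    echo "==> $script"
    # $runner is intentionally unquoted so "npm run" splits into two words.
    if ! $runner "$script"; then
      echo "FAILED: $script (fix before deploying)" >&2
      exit 1
    fi
  done
  echo "All pre-deployment checks passed"
)
```

The order here puts the cheaper checks first so that a type or lint error surfaces before the slower build and E2E runs; `RUNNER=echo predeploy_checks` prints the sequence without executing anything.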

Rollback Scenarios and Solutions

Scenario 1: TypeScript Build Error

Symptoms:

  • Build fails in Netlify
  • Site shows "Deploy failed" status
  • No new version published

Rollback: Not needed (a failed build is never published, so the site stays on the previous version)

Fix:

  1. Revert commit locally
  2. Fix TypeScript error
  3. Test build: npm run type-check && npm run build
  4. Push fix

Scenario 2: Runtime Error (Site Down)

Symptoms:

  • Site shows 500 errors
  • /judges page not loading
  • Console shows "Failed to fetch"

Rollback: IMMEDIATE (Method 1 - Dashboard)

Post-Rollback Investigation:

  1. Check Netlify function logs
  2. Verify database connection
  3. Check environment variables
  4. Review API endpoint changes

Scenario 3: Database Migration Issue

Symptoms:

  • Site loads but features broken
  • "Table not found" errors
  • Queries failing

Rollback: IMMEDIATE (Method 1 - Dashboard)

Post-Rollback:

  1. Roll back the database migration (if applied)
    npm run migration:rollback
  2. Verify database schema
  3. Test migration locally
  4. Apply migration manually and test before redeploy

Scenario 4: Authentication Broken

Symptoms:

  • Cannot sign in/sign up
  • Clerk errors in console
  • Redirect loops

Rollback: IMMEDIATE (Method 1 - Dashboard)

Post-Rollback:

  1. Check Clerk environment variables in Netlify
  2. Verify CLERK_SECRET_KEY is set
  3. Check Clerk dashboard for API key validity
  4. Test authentication locally
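
A missing or empty key is a common cause of this scenario, and it can be caught with a small guard at the start of a build or local test run. The sketch below checks variables in the current shell environment only; it does not query Netlify. `require_env` is a hypothetical name, and `CLERK_SECRET_KEY` is the only variable name taken from this document.

```shell
# Hypothetical guard: fail if any named environment variable is unset or empty.
# Usage: require_env CLERK_SECRET_KEY [OTHER_VAR...]
require_env() (
  missing=0
  for name in "$@"; do
    # Indirect lookup uses eval, so restrict names to safe identifiers first.
    case "$name" in
      ''|[0-9]*|*[!A-Za-z0-9_]*)
        echo "Invalid variable name: $name" >&2
        exit 2
        ;;
    esac
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)
```

To compare against what Netlify has stored, `netlify env:get CLERK_SECRET_KEY` prints the value configured for the linked site.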

Scenario 5: Performance Degradation

Symptoms:

  • Site very slow (> 10 second load)
  • High function execution time
  • Database query timeouts

Rollback: If critical (Method 1 - Dashboard)

Post-Rollback:

  1. Review database indexes
  2. Check API query efficiency
  3. Run performance benchmark: npm run benchmark:performance
  4. Optimize queries before redeploy

Emergency Contacts

Priority Order

  1. Developer On-Call:

  2. Netlify Support:

  3. Supabase Support:

  4. Clerk Support:


Rollback Checklist (Quick Reference)

Before Rollback:

  • Confirm issue is critical
  • Document error messages
  • Take screenshots
  • Note failed deploy ID

Execute Rollback:

  • Choose method (Dashboard fastest)
  • Rollback to previous deploy
  • Wait for activation (30-60s)

Verify Rollback:

  • Site responds HTTP 200
  • /judges displays 4,732 judges
  • Authentication works
  • API endpoints functional
  • No console errors

Post-Rollback:

  • Document rollback in report
  • Notify team
  • Create GitHub issue
  • Analyze root cause
  • Fix issue locally
  • Test fix thoroughly

Before Redeploying:

  • All tests pass locally
  • Build succeeds
  • Issue completely resolved
  • Additional checks added to prevent recurrence

Rollback Testing

Periodic Rollback Drills (Monthly):

Practice the rollback procedure so that everyone on call stays familiar with it:

  1. Deploy to staging/preview
  2. Practice rollback using all 3 methods
  3. Time each method
  4. Update documentation if needed

Goal: Rollback execution time < 3 minutes
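
Timing a drill by stopwatch is easy to forget. As a sketch, a wrapper can time any rollback command and compare the result with the 3-minute goal. `timed_drill` is a hypothetical name, and the limit is passed in seconds (180 for the goal above).

```shell
# Hypothetical helper: time a command and compare it with a limit in seconds.
# Usage: timed_drill <limit_seconds> <command...>
timed_drill() (
  limit="$1"
  shift

  start="$(date +%s)"
  "$@"
  status=$?
  elapsed=$(( $(date +%s) - start ))

  echo "Elapsed: ${elapsed}s (goal: under ${limit}s), exit status: $status"
  # Fail if the command failed or the goal was missed.
  [ "$status" -eq 0 ] && [ "$elapsed" -lt "$limit" ]
)
```

Wrapping each method's command in `timed_drill 180 ...` during a drill leaves a one-line record per method that can be pasted into the drill notes.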


Additional Resources


Last Updated: 2025-11-12
Version: 1.0