Closed
Description
Problem
The touchpoints-production-sidekiq-worker app is crashing repeatedly (every ~16 minutes) and showing 0/1 instances running. This prevents ALL background jobs from processing, including:
- Export jobs (form responses, events, versions, digital service accounts)
- Email notifications
- Scheduled tasks
Impact
- No background jobs are being processed in production
- Users requesting data exports never receive their files
- Scheduled jobs via cf run-task may also be affected
- Cloud Foundry is sending continuous "sidekiq worker failed" emails
Root Cause
The sidekiq worker app is configured to run bin/rails server (Rails web server) instead of bundle exec sidekiq.
Evidence
$ cf app touchpoints-production-sidekiq-worker
# instances: 0/1 # Worker is DOWN
$ cf events touchpoints-production-sidekiq-worker | head -5
# Shows continuous crashes with "APP/PROC/WEB: Exited with status 1"
$ cf curl /v3/apps/$(cf app touchpoints-production-sidekiq-worker --guid)/droplets/current
# Shows process type: "web":"bin/rails server -b 0.0.0.0 -p $PORT -e $RAILS_ENV"
# Should be: "worker":"bundle exec sidekiq"
Why This Causes Crashes
- The web server starts but expects HTTP traffic
- No route exists to send traffic to the sidekiq worker
- Without traffic, the process appears unhealthy to Cloud Foundry's health check
- CF kills and restarts it repeatedly
- Since there's no manifest or command override, the Ruby buildpack defaults to rails server
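The droplet check from the Evidence section can be scripted for quick triage. The sketch below is a crude text match on the JSON, not a real parser, and the function name classify_start_command is hypothetical. It only reports which kind of start command appears in the droplet's process types.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the start command found in droplet JSON on stdin.
# Prints "sidekiq" if a sidekiq command is present, "rails-server" if only a
# Rails web server command is found, and "unknown" otherwise.
classify_start_command() {
  local json
  json="$(cat)"
  if printf '%s' "$json" | grep -q 'bundle exec sidekiq'; then
    echo "sidekiq"
  elif printf '%s' "$json" | grep -Eq 'rails (server|s)([^a-z]|$)'; then
    echo "rails-server"
  else
    echo "unknown"
  fi
}

# Usage (requires a logged-in cf CLI):
#   guid="$(cf app touchpoints-production-sidekiq-worker --guid)"
#   cf curl "/v3/apps/${guid}/droplets/current" | classify_start_command
```

A result of rails-server on a worker app matches the misconfiguration described above.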
Proposed Solution
Fix 1: Update Deploy Script (Immediate Fix - Recommended)
Modify .circleci/deploy-sidekiq.sh to pass the correct command:
# Lines 131-133: Add -c flag with sidekiq command
if cf push "$app_name" \
  -t 180 \
  -c "bundle exec sidekiq -C config/sidekiq.yml" \
  --health-check-type process; then
This is the fastest fix with the smallest change surface.
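For context, here is a minimal sketch of how the push could be wrapped so that a failed deploy is reported and propagated. The function name push_sidekiq_worker and the SIDEKIQ_CMD variable are assumptions for illustration; the actual structure of .circleci/deploy-sidekiq.sh may differ.

```shell
#!/usr/bin/env bash
# Assumed variable: the start command the worker should run.
SIDEKIQ_CMD="bundle exec sidekiq -C config/sidekiq.yml"

# Hypothetical wrapper around the cf push shown above.
# Returns non-zero if the push fails so CI can stop the pipeline.
push_sidekiq_worker() {
  local app_name="$1"
  if cf push "$app_name" \
      -t 180 \
      -c "$SIDEKIQ_CMD" \
      --health-check-type process; then
    echo "Pushed $app_name with start command: $SIDEKIQ_CMD"
  else
    echo "Push failed for $app_name" >&2
    return 1
  fi
}

# Usage:
#   push_sidekiq_worker touchpoints-production-sidekiq-worker
```

The process health check type matters here because a Sidekiq worker does not listen on a port.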
Fix 2: Create Separate Sidekiq Manifests (Alternative)
Create manifest files for each environment:
- touchpoints-production-sidekiq.yml
- touchpoints-staging-sidekiq.yml
- touchpoints-demo-sidekiq.yml
Each with:
applications:
- name: touchpoints-production-sidekiq-worker
  command: bundle exec sidekiq -C config/sidekiq.yml
  memory: 4G
  # ... other configs
Fix 3: Create Procfile (Best Practice - Most Change)
Create Procfile with multiple process types:
web: bundle exec rails s -b 0.0.0.0 -p $PORT -e $RAILS_ENV
worker: bundle exec sidekiq -C config/sidekiq.yml
Then update manifests to use different process types.
Implementation Plan
Phase 1: Fix Production (URGENT)
- Update .circleci/deploy-sidekiq.sh to include sidekiq command
- Deploy to production
- Verify worker starts: cf app touchpoints-production-sidekiq-worker (should show 1/1)
- Check logs: cf logs touchpoints-production-sidekiq-worker --recent
- Verify job processing in Sidekiq Web UI at /admin/sidekiq
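The "should show 1/1" check in Phase 1 can be automated. This sketch assumes the cf app output contains a line of the form "instances: N/M", as in the Evidence section; the helper names instance_count and worker_is_up are hypothetical.

```shell
#!/usr/bin/env bash
# Print the "running/total" value from the first "instances:" line on stdin.
instance_count() {
  awk '/^[[:space:]]*instances:/ { print $2; exit }'
}

# Succeed only if at least one instance exists and all instances are running.
worker_is_up() {
  local count running total
  count="$(instance_count)"
  running="${count%%/*}"
  total="${count##*/}"
  [ -n "$count" ] && [ "$running" = "$total" ] && [ "$running" != "0" ]
}

# Usage (requires a logged-in cf CLI):
#   cf app touchpoints-production-sidekiq-worker | worker_is_up && echo "worker is up"
```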
Phase 2: Fix Staging and Demo
- Same deployment script update applies to all environments
- Deploy to staging: touchpoints-staging-sidekiq-worker
- Deploy to demo: touchpoints-demo-sidekiq-worker
- Verify each environment
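Verifying each environment can be done in one pass. The loop below is a sketch that assumes the three app names follow the touchpoints-ENV-sidekiq-worker pattern listed above and that the current cf target can see all three apps, which may not hold if each environment lives in a separate space. The function name check_all_workers is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each environment's worker shows 1/1 instances.
# Returns non-zero if any worker is not fully running.
check_all_workers() {
  local env app failed=0
  for env in production staging demo; do
    app="touchpoints-${env}-sidekiq-worker"
    if cf app "$app" | grep -Eq '^[[:space:]]*instances:[[:space:]]*1/1'; then
      echo "$app: up"
    else
      echo "$app: DOWN"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_all_workers || echo "at least one worker needs attention"
```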
Phase 3: Improvements (Follow-up)
- Add error handling and retry policies to export jobs
- Configure monitoring for job failures
- Add user-facing error notifications
- Consider increasing concurrency if needed
Verification Checklist
- Sidekiq worker shows instances: 1/1 (not 0/1)
- Worker logs show "Sidekiq" startup message, not "Rails" server
- Export jobs complete successfully
- No more "sidekiq worker failed" emails
- Sidekiq Web UI shows active workers processing jobs
- Staging and demo workers also fixed
Current Status
- Production: BROKEN (0/1 instances, continuous crashes)
- Staging: Likely broken (same deploy script)
- Demo: Likely broken (same deploy script)
Related Files
- .circleci/deploy-sidekiq.sh - deployment script (needs -c flag)
- config/sidekiq.yml - sidekiq configuration (concurrency: 1, queues: default, mailers)
- app/jobs/ - all background jobs currently not processing
References
- manifest.sample.yml line 12: Contains commented example of correct sidekiq command
- config/initializers/vcap_services.rb - Sets up Redis connection from CF services
- config/initializers/sidekiq.rb - Configures Sidekiq Redis connection