Refactor docker healthchecks from Redis to RabbitMQ and add worker protections #1050
+183
−4
Summary
Ports healthcheck and worker protection features from PR #1024 to use RabbitMQ (actual broker) instead of Redis, while maintaining debugpy configuration from main.
List of Changes
- kombu.Connection for broker connectivity testing
- procps in Dockerfiles for process monitoring via pgrep
- --max-tasks-per-child=50, --max-memory-per-child=4000000
- autoheal service for automatic container restart on health failure
- CELERY_BROKER_URL within Python context
- .format() for Python < 3.6 compatibility
Related Issues
Relates to #1024
Detailed Description
PR #1024 implemented healthchecks using redis-cli to test Redis connectivity, but this system uses RabbitMQ as the Celery broker. This PR adapts those changes:
Healthcheck Implementation:
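The healthcheck script itself is not reproduced in this description. As a rough illustration only, a broker check built on kombu.Connection might look like the sketch below. The function names, the 5-second timeout, and the exit-code convention are assumptions for the example, not the PR's actual code. The broker URL is read from os.environ inside Python rather than interpolated by the shell, and messages use .format() instead of f-strings.

```python
import os
import sys


def broker_url(environ=os.environ):
    """Return CELERY_BROKER_URL from the environment, or None if unset/empty."""
    return environ.get("CELERY_BROKER_URL") or None


def broker_is_reachable(url, timeout=5):
    """Open a connection to the broker and release it again.

    The timeout value is an assumption chosen for this sketch.
    """
    # Imported lazily so the helpers above can be used without kombu installed.
    from kombu import Connection

    try:
        with Connection(url, connect_timeout=timeout) as conn:
            # Raises if the broker cannot be reached after one retry.
            conn.ensure_connection(max_retries=1)
        return True
    except Exception as exc:
        print("Broker check failed: {}".format(exc), file=sys.stderr)
        return False


def main(environ=os.environ):
    """Return 0 when the broker is reachable, 1 otherwise (Docker convention)."""
    url = broker_url(environ)
    if url is None:
        print("CELERY_BROKER_URL is not set", file=sys.stderr)
        return 1
    return 0 if broker_is_reachable(url) else 1


if __name__ == "__main__":
    sys.exit(main())
```

A Docker HEALTHCHECK treats exit code 0 as healthy and 1 as unhealthy, which is why main() returns those two values.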
Debugpy Integration:
Maintained main's DEBUGGER env var approach (port 5679) rather than PR #1024's CELERY_DEBUG (port 5678).
Security Fix:
The original PR passed CELERY_BROKER_URL to Python via shell variable expansion. It is now read directly from os.environ within Python to prevent injection attacks.
Autoheal Configuration:
Monitors container health every 10s, restarts unhealthy containers after 3 consecutive failures (90s total).
How to Test the Changes
Verify healthcheck:
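The original command for this step did not survive in the description. One way to read a container's health state from Python is to wrap docker inspect, as sketched here. The container name passed in is an assumption: Compose usually prefixes service names, so check docker ps for the real name. The run parameter is only there so the helper can be exercised without Docker.

```python
import subprocess


def health_status(container, run=subprocess.run):
    """Return the Docker health status string for a container.

    Typical values are "starting", "healthy", and "unhealthy".
    """
    result = run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True,
        text=True,
        check=True,  # raise if docker inspect fails (e.g. unknown container)
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # "celeryworker" is a placeholder; substitute the actual container name.
    print(health_status("celeryworker"))
```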
Test unhealthy detection:
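The original steps for this test are likewise missing. A plausible approach, assuming the broker has been stopped by hand first, is to poll the worker's health status until it reports "unhealthy" or a deadline passes. The helper below is a generic polling sketch; its name, defaults, and the injectable sleep and clock parameters are all illustrative.

```python
import time


def wait_for_status(get_status, wanted, timeout=120.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns `wanted` or `timeout` seconds pass.

    Returns True if the wanted status was seen, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In practice get_status would be a callable that returns the container's current Docker health status, and wanted would be "unhealthy".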
Test debugpy:
DEBUGGER=1 docker compose up celeryworker # Attach debugger to localhost:5679
Screenshots
N/A - Infrastructure change
Deployment Notes
Docker Compose deployments:
Autoheal monitors containers with the autoheal=true label. No configuration changes are required beyond deploying the updated compose files.
Kubernetes/Swarm (future):
Healthcheck scripts are orchestration-agnostic. Use as liveness probes directly.
Tuning parameters:
--max-tasks-per-child and --max-memory-per-child in worker start scripts
Checklist