From 68362e4cdbfcd36c6c9088e7ff30ca75b991fb3b Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Sun, 3 Nov 2024 14:58:34 +0100 Subject: [PATCH 1/7] feat: true zero-downtime deployment with request draining --- README.md | 61 ++++++++++++++++++-- docker-rollout | 13 +++++ examples/request-draining/Dockerfile | 5 ++ examples/request-draining/README.md | 11 ++++ examples/request-draining/docker-compose.yml | 24 ++++++++ 5 files changed, 109 insertions(+), 5 deletions(-) create mode 100644 examples/request-draining/Dockerfile create mode 100644 examples/request-draining/README.md create mode 100644 examples/request-draining/docker-compose.yml diff --git a/README.md b/README.md index 7601474..4ce68cc 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ Simply replace `docker compose up -d ` with `docker rollout ` - [Usage](#usage) - [⚠️ Caveats](#️-caveats) - [Sample deployment script](#sample-deployment-script) + - [True zero-downtime deployment with request draining](#true-zero-downtime-deployment-with-request-draining) - [Why?](#why) - [License](#license) @@ -49,6 +50,7 @@ Options: - `-w | --wait SECONDS` - (not required) - Time to wait for new container to be ready if healthcheck is not defined. Default: 10 - `--wait-after-healthy SECONDS` - (not required) - Time to wait after new container is healthy before removing old container. Works when healthcheck is defined. Default: 0 - `--env-file FILE` - (not required) - Path to env file, can be specified multiple times, as in `docker compose`. +- `--pre-stop-hook` - (not required) - Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](#true-zero-downtime-deployment-with-request-draining) below. See [examples](https://docker-rollout.wowu.dev/examples/) in docs for sample `docker-compose.yml` files. 
@@ -57,6 +59,7 @@ See [examples](https://docker-rollout.wowu.dev/examples/) in docs for sample `do - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic. - Each deployment will increment the index in container name (e.g. `project-web-1` -> `project-web-2`). +- Some requests might still be failing in the short period between the old container is stopped and proxy stops sending requests to it. For most cases it's not a problem, but this can be fixed with [request draining](#true-zero-downtime-deployment-with-request-draining) described below, at the cost of a more complex setup. ### Sample deployment script @@ -68,17 +71,65 @@ git pull # Build new app image docker compose build web # Run database migrations -docker compose run web rake db:migrate -# Deploy new version +docker compose run --rm web rake db:migrate +# Deploy new version without downtime docker rollout web ``` +### True zero-downtime deployment with request draining + +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be made failing on purpose when performing rollout to make proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. + +1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. 
+ + If your service doesn't have a healthcheck yet: + + ```yml + services: + web: + image: myapp:latest + healthcheck: + test: ["CMD", "test", "!", "-f", "/tmp/drain"] + interval: 5s + retries: 1 + ``` + + If your service already has a healthcheck: + + ```yml + services: + web: + image: myapp:latest + healthcheck: + test: test ! -f /tmp/drain && curl -f http://localhost:3000/healthcheck + interval: 5s + retries: 1 + ``` + +2. Use the following command to perform a zero-downtime deployment: + + ```bash + docker rollout web --pre-stop-hook "touch /tmp/drain && sleep 10" + ``` + + **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + time to finish processing open requests (e.g. `interval: 10s`, `retries: 3`, additional time of 5s = `sleep 35`) so the healthcheck has enough time to mark the container as unhealthy. + +With this configuration, a rollout process will look like this: + +1. New container is started. +2. Docker daemon marks the new container as healthy. +3. Proxy starts sending requests to the new container alongside the old container. +4. We create `/tmp/drain` file in the old container. +5. Docker daemon marks the old container as unhealthy. +6. Proxy stops sending requests to the old container. +7. Old container is removed. + ## Why? -Using `docker compose up` to deploy a new version of a service causes downtime because the app container is stopped before the new container is created. -If your application takes a while to boot, this may be noticeable to users. +Using `docker compose up` to deploy a new version of your app causes downtime because the app container has to be stopped before the new container is created. +If your application takes a while to boot, this may be noticeable to your users. -Using container orchestration tools like [Kubernetes](https://kubernetes.io/) or [Nomad](https://www.nomadproject.io/) is usually an overkill for projects that will do fine with a single-server Docker Compose setup. 
[Dokku](https://github.com/dokku/dokku) comes with zero-downtime deployment and more useful features, but it's not as flexible as Docker Compose. +Using container orchestration tools like [Kubernetes](https://kubernetes.io/) or [Nomad](https://www.nomadproject.io/) can be an overkill for projects that will do fine with a single-server Docker Compose setup. [Dokku](https://github.com/dokku/dokku) comes with zero-downtime deployment and more useful features, but it's not as flexible as Docker Compose. If you have a proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy), a zero downtime deployment can be achieved by writing a script that scales the service to 2 instances, waits for the new container to be ready, and then removes the old container. `docker rollout` does exactly that, but with a single command that you can use in your deployment scripts. diff --git a/docker-rollout b/docker-rollout index a697cf7..52f362a 100755 --- a/docker-rollout +++ b/docker-rollout @@ -59,6 +59,7 @@ Options: --wait-after-healthy N When healthcheck is defined and succeeds, wait for additional N seconds before stopping the old container (default: 0 seconds) --env-file FILE Specify an alternate environment file + --pre-stop-hook CMD Run a command in the old container before stopping it. 
-v, --version Print plugin version EOF @@ -154,6 +155,15 @@ main() { echo "==> Stopping and removing old containers" + if [ -n "$PRE_STOP_HOOK" ]; then + echo "==> Running pre-stop hook: $PRE_STOP_HOOK" + + for OLD_CONTAINER_ID in $OLD_CONTAINER_IDS; do + # shellcheck disable=SC2086 # DOCKER_ARGS must be unquoted to allow multiple arguments + docker $DOCKER_ARGS exec "$OLD_CONTAINER_ID" sh -c "$PRE_STOP_HOOK" + done + fi + # shellcheck disable=SC2086 # DOCKER_ARGS and OLD_CONTAINER_IDS must be unquoted to allow multiple arguments docker $DOCKER_ARGS stop $OLD_CONTAINER_IDS # shellcheck disable=SC2086 # DOCKER_ARGS and OLD_CONTAINER_IDS must be unquoted to allow multiple arguments @@ -186,6 +196,9 @@ while [ $# -gt 0 ]; do WAIT_AFTER_HEALTHY_DELAY="$2" shift 2 ;; + --pre-stop-hook) + PRE_STOP_HOOK="$2" + shift 2 -v | --version) echo "docker-rollout version $VERSION" exit 0 diff --git a/examples/request-draining/Dockerfile b/examples/request-draining/Dockerfile new file mode 100644 index 0000000..54c9701 --- /dev/null +++ b/examples/request-draining/Dockerfile @@ -0,0 +1,5 @@ +# Copy whoami binary to alpine based image to have shell commands available +FROM alpine +COPY --from=traefik/whoami /whoami /whoami +ENTRYPOINT [ "/whoami" ] +EXPOSE 80 diff --git a/examples/request-draining/README.md b/examples/request-draining/README.md new file mode 100644 index 0000000..2eb38c5 --- /dev/null +++ b/examples/request-draining/README.md @@ -0,0 +1,11 @@ +# Request draining example + +1. Change domain in `docker-compose.yml` to a domain pointing to your server +2. Start all services + ```bash + docker-compose up -d + ``` +3. 
Deploy new version of `whoami` service without downtime + ```bash + docker rollout whoami --pre-stop-hook "touch /tmp/drain && sleep 10" + ``` diff --git a/examples/request-draining/docker-compose.yml b/examples/request-draining/docker-compose.yml new file mode 100644 index 0000000..4d62ffa --- /dev/null +++ b/examples/request-draining/docker-compose.yml @@ -0,0 +1,24 @@ +services: + whoami: + build: . + labels: + - "traefik.enable=true" + - "traefik.http.routers.whoami.entrypoints=web" + - "traefik.http.routers.whoami.rule=Host(`example.com`)" + healthcheck: + test: ["CMD", "test", "!", "-f", "/tmp/drain"] + interval: 5s + retries: 1 + traefik: + image: traefik:v2.9 + container_name: traefik + command: + - "--api.insecure=true" + - "--providers.docker=true" + - "--providers.docker.exposedbydefault=false" + - "--entrypoints.web.address=:80" + ports: + - "9080:80" + - "9088:8080" + volumes: + - "/var/run/docker.sock:/var/run/docker.sock:ro" From aeaae3d040476d8d7d13de932ed11bf44bcdaf1f Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Sun, 3 Nov 2024 15:01:18 +0100 Subject: [PATCH 2/7] copy improvements --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 4ce68cc..7b74440 100644 --- a/README.md +++ b/README.md @@ -78,7 +78,7 @@ docker rollout web ### True zero-downtime deployment with request draining -If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be made failing on purpose when performing rollout to make proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. 
It requires adding a healthcheck to your container that will be made failing on purpose when performing rollout to the make proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. 1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. @@ -112,9 +112,9 @@ If you want to make sure that no requests are lost during deployment, you can us docker rollout web --pre-stop-hook "touch /tmp/drain && sleep 10" ``` - **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + time to finish processing open requests (e.g. `interval: 10s`, `retries: 3`, additional time of 5s = `sleep 35`) so the healthcheck has enough time to mark the container as unhealthy. + **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + `time to finish processing open requests` (e.g. interval: 10s, retries: 3, additional time of 5s = sleep 35) so the healthcheck has enough time to mark the container as unhealthy. -With this configuration, a rollout process will look like this: +With this configuration, a rollout process looks like this: 1. New container is started. 2. Docker daemon marks the new container as healthy. 
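The **Important** note above gives a rule of thumb for the sleep time: healthcheck `interval` × `retries`, plus time to finish processing open requests. A minimal sketch of that arithmetic follows, assuming all values are whole seconds. The helper names `drain_sleep_seconds` and `pre_stop_hook` are hypothetical and not part of docker-rollout; the script only prints the resulting command and does not call Docker.

```bash
#!/bin/sh
# Hypothetical helpers (not part of docker-rollout): derive the sleep time
# for the pre-stop hook from the healthcheck settings.

# Usage: drain_sleep_seconds INTERVAL RETRIES EXTRA   (whole seconds)
drain_sleep_seconds() {
  interval="$1"
  retries="$2"
  extra="$3"
  # interval x retries: time for the healthcheck to report "unhealthy",
  # per the rule of thumb in the note above.
  # extra: additional time for open requests to finish.
  echo $(( interval * retries + extra ))
}

# Build the hook string used in the examples: create the drain marker,
# then sleep for the computed number of seconds.
pre_stop_hook() {
  echo "touch /tmp/drain && sleep $(drain_sleep_seconds "$@")"
}

# Example from the note: interval 10s, retries 3, 5s of additional time.
echo "docker rollout web --pre-stop-hook \"$(pre_stop_hook 10 3 5)\""
```

With the values from the note (interval 10s, retries 3, 5s extra) the printed hook ends in `sleep 35`. The note asks for a sleep longer than that sum, so the computed value is best treated as a minimum.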
From 0e07f65e05dfd71846bec63edc81be8649221cfd Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Wed, 21 May 2025 20:16:33 +0200 Subject: [PATCH 3/7] docs: move example to docs, add cli option --- README.md | 4 +- docs/cli-options.md | 15 +++++ docs/examples/request-draining.md | 69 ++++++++++++++++++++ docs/getting-started.md | 2 +- docs/index.md | 3 +- examples/request-draining/Dockerfile | 5 -- examples/request-draining/README.md | 11 ---- examples/request-draining/docker-compose.yml | 24 ------- 8 files changed, 89 insertions(+), 44 deletions(-) create mode 100644 docs/examples/request-draining.md delete mode 100644 examples/request-draining/Dockerfile delete mode 100644 examples/request-draining/README.md delete mode 100644 examples/request-draining/docker-compose.yml diff --git a/README.md b/README.md index 7b74440..c50e5b9 100644 --- a/README.md +++ b/README.md @@ -58,8 +58,8 @@ See [examples](https://docker-rollout.wowu.dev/examples/) in docs for sample `do - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic. -- Each deployment will increment the index in container name (e.g. `project-web-1` -> `project-web-2`). -- Some requests might still be failing in the short period between the old container is stopped and proxy stops sending requests to it. For most cases it's not a problem, but this can be fixed with [request draining](#true-zero-downtime-deployment-with-request-draining) described below, at the cost of a more complex setup. +- Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). 
+- Some requests might still fail during the brief moment between when the old container is stopped and when the proxy stops sending traffic to it. In most cases, this isn't an issue, but you can fully prevent it by configuring [request draining](#true-zero-downtime-deployment-with-request-draining), which requires a slightly more complex setup. ### Sample deployment script diff --git a/docs/cli-options.md b/docs/cli-options.md index bcdb521..83094dc 100644 --- a/docs/cli-options.md +++ b/docs/cli-options.md @@ -100,3 +100,18 @@ Multiple env files: docker rollout --env-file .env --env-file .env.prod ``` +## `--pre-stop-hook COMMAND` + +Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](examples/request-draining.md) below. + +**Example** + +Deploy a new version of the service and mark the old container as unhealthy before stopping it: + +```bash +docker rollout --pre-stop-hook "touch /tmp/drain && sleep 10" +``` + +{: .warning } +This requires the service to have a healthcheck defined in `docker-compose.yml` or `Dockerfile` that will fail if `/tmp/drain` file exists. + diff --git a/docs/examples/request-draining.md b/docs/examples/request-draining.md new file mode 100644 index 0000000..70332a7 --- /dev/null +++ b/docs/examples/request-draining.md @@ -0,0 +1,69 @@ +--- +title: Request Draining +parent: Examples +--- + +# Request Draining with Traefik + +Works with Docker Compose v2. + +## Files + +`Dockerfile` + +```Dockerfile +FROM alpine +# Use alpine image with whoami binary to have shell commands available +COPY --from=traefik/whoami /whoami /whoami +ENTRYPOINT [ "/whoami" ] +EXPOSE 80 +``` + +`compose.yml` + +```yml +services: + whoami: + build: . + labels: + - "traefik.enable=true" + - "traefik.http.routers.whoami.entrypoints=web" + - "traefik.http.routers.whoami.rule=Host(`example.com`)" + healthcheck: + test: "test ! 
-f /tmp/drain" + interval: 5s + retries: 1 + + traefik: + image: traefik:v2.9 + container_name: traefik + command: + - "--api.insecure=true" + - "--providers.docker=true" + - "--providers.docker.exposedbydefault=false" + - "--entrypoints.web.address=:80" + ports: + - "80:80" + - "8080:8080" + volumes: + - "/var/run/docker.sock:/var/run/docker.sock:ro" + +``` + +## Steps + +1. Change domain in `compose.yml` to a domain pointing to your server. + +2. Start all services + + ```bash + docker compose up -d + ``` + +3. Deploy new version of `whoami` service without downtime + + ```bash + docker rollout whoami --pre-stop-hook "touch /tmp/drain && sleep 10" + ``` + + New container will be created, then the old container will be marked as unhealthy and removed after 10 seconds. Traefik will stop sending requests to the old container when it becomes unhealthy, allowing it to finish pending requests before being removed. diff --git a/docs/getting-started.md b/docs/getting-started.md index 60dd81e..0d281d7 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -110,7 +110,7 @@ git pull # Build new app image docker compose build web # Run database migrations -docker compose run web rake db:migrate +docker compose run --rm web rake db:migrate # Deploy new version docker rollout web ``` diff --git a/docs/index.md b/docs/index.md index 04c4b3d..98157b6 100644 --- a/docs/index.md +++ b/docs/index.md @@ -29,7 +29,8 @@ Using `docker compose up` to deploy a new version of a service causes downtime b - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic to the containers. Refer to the [Examples](examples) for sample compose files. 
-- Each deployment will increment the index in container name (e.g. `project-web-1` -> `project-web-2`). +- Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). +- Some requests might still fail during the brief moment between when the old container is stopped and when the proxy stops sending traffic to it. In most cases, this isn't an issue, but you can fully prevent it by configuring request draining, which requires a slightly more complex setup. ## Installation diff --git a/examples/request-draining/Dockerfile b/examples/request-draining/Dockerfile deleted file mode 100644 index 54c9701..0000000 --- a/examples/request-draining/Dockerfile +++ /dev/null @@ -1,5 +0,0 @@ -# Copy whoami binary to alpine based image to have shell commands available -FROM alpine -COPY --from=traefik/whoami /whoami /whoami -ENTRYPOINT [ "/whoami" ] -EXPOSE 80 diff --git a/examples/request-draining/README.md b/examples/request-draining/README.md deleted file mode 100644 index 2eb38c5..0000000 --- a/examples/request-draining/README.md +++ /dev/null @@ -1,11 +0,0 @@ -# Request draining example - -1. Change domain in `docker-compose.yml` to a domain pointing to your server -2. Start all services - ```bash - docker-compose up -d - ``` -3. Deploy new version of `whoami` service without downtime - ```bash - docker rollout whoami --pre-stop-hook "touch /tmp/drain && sleep 10" - ``` diff --git a/examples/request-draining/docker-compose.yml b/examples/request-draining/docker-compose.yml deleted file mode 100644 index 4d62ffa..0000000 --- a/examples/request-draining/docker-compose.yml +++ /dev/null @@ -1,24 +0,0 @@ -services: - whoami: - build: . 
- labels: - - "traefik.enable=true" - - "traefik.http.routers.whoami.entrypoints=web" - - "traefik.http.routers.whoami.rule=Host(`example.com`)" - healthcheck: - test: ["CMD", "test", "!", "-f", "/tmp/drain"] - interval: 5s - retries: 1 - traefik: - image: traefik:v2.9 - container_name: traefik - command: - - "--api.insecure=true" - - "--providers.docker=true" - - "--providers.docker.exposedbydefault=false" - - "--entrypoints.web.address=:80" - ports: - - "9080:80" - - "9088:8080" - volumes: - - "/var/run/docker.sock:/var/run/docker.sock:ro" From 04b321ff575cd36f2592139d0906fc40c8e67f07 Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Wed, 21 May 2025 20:19:04 +0200 Subject: [PATCH 4/7] fix: shellcheck --- docker-rollout | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docker-rollout b/docker-rollout index 52f362a..06efdd7 100755 --- a/docker-rollout +++ b/docker-rollout @@ -153,8 +153,6 @@ main() { sleep "$NO_HEALTHCHECK_TIMEOUT" fi - echo "==> Stopping and removing old containers" - if [ -n "$PRE_STOP_HOOK" ]; then echo "==> Running pre-stop hook: $PRE_STOP_HOOK" @@ -164,6 +162,8 @@ main() { done fi + echo "==> Stopping and removing old containers" + # shellcheck disable=SC2086 # DOCKER_ARGS and OLD_CONTAINER_IDS must be unquoted to allow multiple arguments docker $DOCKER_ARGS stop $OLD_CONTAINER_IDS # shellcheck disable=SC2086 # DOCKER_ARGS and OLD_CONTAINER_IDS must be unquoted to allow multiple arguments @@ -199,6 +199,7 @@ while [ $# -gt 0 ]; do --pre-stop-hook) PRE_STOP_HOOK="$2" shift 2 + ;; -v | --version) echo "docker-rollout version $VERSION" exit 0 From 7c5a7b2d5e12f9dd4d3b409dadf3a30afa0e6324 Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Wed, 28 May 2025 20:43:55 +0200 Subject: [PATCH 5/7] feat: configure request draining via label --- README.md | 34 ++++++++++++---- docker-rollout | 9 ++++- docs/cli-options.md | 8 ++-- docs/examples/request-draining.md | 2 +- docs/index.md | 12 ++++-- docs/request-draining.md | 67 
+++++++++++++++++++++++++++++++ docs/uninstalling.md | 2 +- docs/upgrading.md | 2 +- 8 files changed, 117 insertions(+), 19 deletions(-) create mode 100644 docs/request-draining.md diff --git a/README.md b/README.md index c50e5b9..b962bce 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,13 @@ -

+
+

docker rollout
Zero Downtime Deployment for Docker Compose

+[Docs](https://docker-rollout.wowu.dev) +
+ + Docker CLI plugin that updates Docker Compose services without downtime. Simply replace `docker compose up -d ` with `docker rollout ` in your deployment scripts. This command will scale the service to twice the current number of instances, wait for the new containers to be ready, and then remove the old containers. @@ -59,7 +64,7 @@ See [examples](https://docker-rollout.wowu.dev/examples/) in docs for sample `do - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic. - Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). -- Some requests might still fail during the brief moment between when the old container is stopped and when the proxy stops sending traffic to it. In most cases, this isn't an issue, but you can fully prevent it by configuring [request draining](#true-zero-downtime-deployment-with-request-draining), which requires a slightly more complex setup. +- To avoid dropping currently processed requests when stopping the old container, you need to setup [request draining](#draining-old-containers), which requires a slightly more complex setup. ### Sample deployment script @@ -76,9 +81,9 @@ docker compose run --rm web rake db:migrate docker rollout web ``` -### True zero-downtime deployment with request draining +### Draining old containers -If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be made failing on purpose when performing rollout to the make proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. 
+If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. 1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. @@ -89,12 +94,12 @@ If you want to make sure that no requests are lost during deployment, you can us web: image: myapp:latest healthcheck: - test: ["CMD", "test", "!", "-f", "/tmp/drain"] + test: test ! -f /tmp/drain interval: 5s retries: 1 ``` - If your service already has a healthcheck: + If your service already has a healthcheck (e.g. `curl -f http://localhost:3000/healthcheck`): ```yml services: @@ -106,12 +111,25 @@ If you want to make sure that no requests are lost during deployment, you can us retries: 1 ``` + 2. Use the following command to perform a zero-downtime deployment: ```bash docker rollout web --pre-stop-hook "touch /tmp/drain && sleep 10" ``` + or add the following label to your service in `docker-compose.yml`: + + ```yml + services: + web: + image: myapp:latest + labels: + docker-rollout.pre-stop-hook: "touch /tmp/drain && sleep 10" + ``` + + Remember that docker-rollout reads labels from the old container, so **this hook will be executed during the next deployment**. CLI options have higher priority than container labels, so you can use it to override the label value. + **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + `time to finish processing open requests` (e.g. interval: 10s, retries: 3, additional time of 5s = sleep 35) so the healthcheck has enough time to mark the container as unhealthy. With this configuration, a rollout process looks like this: @@ -124,7 +142,7 @@ With this configuration, a rollout process looks like this: 6. 
Proxy stops sending requests to the old container. 7. Old container is removed. -## Why? +## Why use docker-rollout? Using `docker compose up` to deploy a new version of your app causes downtime because the app container has to be stopped before the new container is created. If your application takes a while to boot, this may be noticeable to your users. @@ -137,4 +155,4 @@ If you're using Docker healthchecks, Traefik will make sure that traffic is only ## License -[MIT License](LICENSE) © Karol Musur +[MIT License](LICENSE) © [Karol Musur](https://wowu.dev) diff --git a/docker-rollout b/docker-rollout index 06efdd7..bce38b7 100755 --- a/docker-rollout +++ b/docker-rollout @@ -153,13 +153,20 @@ main() { sleep "$NO_HEALTHCHECK_TIMEOUT" fi + # Check if pre-stop hook is defined in first old container label + # shellcheck disable=SC2086 # DOCKER_ARGS must be unquoted to allow multiple arguments + PRE_STOP_HOOK=${PRE_STOP_HOOK:-$(docker $DOCKER_ARGS inspect --format='{{index .Config.Labels "docker-rollout.pre-stop-hook"}}' "${OLD_CONTAINER_IDS[0]}")} + if [ -n "$PRE_STOP_HOOK" ]; then echo "==> Running pre-stop hook: $PRE_STOP_HOOK" for OLD_CONTAINER_ID in $OLD_CONTAINER_IDS; do # shellcheck disable=SC2086 # DOCKER_ARGS must be unquoted to allow multiple arguments - docker $DOCKER_ARGS exec "$OLD_CONTAINER_ID" sh -c "$PRE_STOP_HOOK" + docker $DOCKER_ARGS exec "$OLD_CONTAINER_ID" sh -c "$PRE_STOP_HOOK" & done + + # Wait for all pre-stop hooks to finish + wait fi echo "==> Stopping and removing old containers" diff --git a/docs/cli-options.md b/docs/cli-options.md index 83094dc..3573675 100644 --- a/docs/cli-options.md +++ b/docs/cli-options.md @@ -12,7 +12,7 @@ nav_order: 3 ## Docker flags -All docker flags can be used with `docker rollout` normally, like `--context`, `--env`, `--log-level`, etc. +All docker flags can be used with `docker rollout` as usual, like `--context`, `--env`, `--log-level`, etc. 
```bash docker --context my-remote-context rollout @@ -22,7 +22,7 @@ The plugin flags are described below. ## `-f | --file FILE` -Path to compose file, can be specified multiple times, as in `docker compose`. +Path to compose file, can be specified multiple times, like in `docker compose`. **Example** @@ -102,7 +102,9 @@ docker rollout --env-file .env --env-file .env.prod ## `--pre-stop-hook COMMAND` -Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](examples/request-draining.md) below. +Label: `docker-rollout.pre-stop-hook` + +Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](request-draining). **Example** diff --git a/docs/examples/request-draining.md b/docs/examples/request-draining.md index 70332a7..bf7de38 100644 --- a/docs/examples/request-draining.md +++ b/docs/examples/request-draining.md @@ -1,5 +1,5 @@ --- -title: Request Draining +title: Traefik w/ Request Draining parent: Examples --- diff --git a/docs/index.md b/docs/index.md index 98157b6..98fdac0 100644 --- a/docs/index.md +++ b/docs/index.md @@ -30,12 +30,10 @@ Using `docker compose up` to deploy a new version of a service causes downtime b - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic to the containers. Refer to the [Examples](examples) for sample compose files. - Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). 
-- Some requests might still fail during the brief moment between when the old container is stopped and when the proxy stops sending traffic to it. In most cases, this isn't an issue, but you can fully prevent it by configuring request draining, which requires a slightly more complex setup. +- To avoid dropping currently processed requests when stopping the old container, you need to setup [request draining](#draining-old-containers), which requires a slightly more complex setup. ## Installation -Quick install: - ```bash # Create directory for Docker cli plugins mkdir -p ~/.docker/cli-plugins @@ -70,6 +68,12 @@ docker compose run web rake db:migrate docker rollout web ``` +### Draining old containers + +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. + +See [Request draining](request-draining). + ## Rationale and alternatives Using `docker compose up` to deploy a new version of a service causes downtime because the app container is stopped before the new container is created. 
@@ -83,5 +87,5 @@ If you're using Docker healthchecks, Traefik will make sure that traffic is only ## License -[MIT License](https://github.com/wowu/docker-rollout/blob/main/LICENSE) © Karol Musur +[MIT License](https://github.com/wowu/docker-rollout/blob/main/LICENSE) © [Karol Musur](https://wowu.dev) diff --git a/docs/request-draining.md b/docs/request-draining.md new file mode 100644 index 0000000..35dffd9 --- /dev/null +++ b/docs/request-draining.md @@ -0,0 +1,67 @@ +--- +title: Request Draining +nav_order: 4 +--- + +# True zero-downtime deployment with request draining + +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. + +1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. + + If your service doesn't have a healthcheck yet: + + ```yml + services: + web: + image: myapp:latest + healthcheck: + test: test ! -f /tmp/drain + interval: 5s + retries: 1 + ``` + + If your service already has a healthcheck (e.g. `curl -f http://localhost:3000/healthcheck`): + + ```yml + services: + web: + image: myapp:latest + healthcheck: + test: test ! -f /tmp/drain && curl -f http://localhost:3000/healthcheck + interval: 5s + retries: 1 + ``` + + +2. Use the following command to perform a zero-downtime deployment: + + ```bash + docker rollout web --pre-stop-hook "touch /tmp/drain && sleep 10" + ``` + + or add the following label to your service in `docker-compose.yml`: + + ```yml + services: + web: + image: myapp:latest + labels: + docker-rollout.pre-stop-hook: "touch /tmp/drain && sleep 10" + ``` + + Remember that docker-rollout reads labels from the old container, so **this hook will be executed during the next deployment**. 
CLI options have higher priority than container labels, so you can use it to override the label value. + + **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + `time to finish processing open requests` (e.g. interval: 10s, retries: 3, additional time of 5s = sleep 35) so the healthcheck has enough time to mark the container as unhealthy. + +With this configuration, a rollout process looks like this: + +1. New container is started. +2. Docker daemon marks the new container as healthy. +3. Proxy starts sending requests to the new container alongside the old container. +4. We create `/tmp/drain` file in the old container. +5. Docker daemon marks the old container as unhealthy. +6. Proxy stops sending requests to the old container. +7. Old container is removed. + +See sample configuration for [Traefik](examples/request-draining.md). diff --git a/docs/uninstalling.md b/docs/uninstalling.md index 6b55865..9a740f3 100644 --- a/docs/uninstalling.md +++ b/docs/uninstalling.md @@ -1,6 +1,6 @@ --- title: Uninstalling -nav_order: 5 +nav_order: 6 --- # Uninstalling docker rollout diff --git a/docs/upgrading.md b/docs/upgrading.md index 2cf5351..4c2e379 100644 --- a/docs/upgrading.md +++ b/docs/upgrading.md @@ -1,6 +1,6 @@ --- title: Upgrading -nav_order: 4 +nav_order: 5 --- # Upgrading docker rollout From 3affa26ba0a07cfa5275cb8adc53e05392af9b9c Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Thu, 29 May 2025 12:49:12 +0200 Subject: [PATCH 6/7] fix: shellcheck --- docker-rollout | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docker-rollout b/docker-rollout index bce38b7..f6ad77f 100755 --- a/docker-rollout +++ b/docker-rollout @@ -154,8 +154,9 @@ main() { fi # Check if pre-stop hook is defined in first old container label + FIRST_OLD_CONTAINER_ID=$(echo "$OLD_CONTAINER_IDS" | cut -d\  -f 1) # shellcheck disable=SC2086 # DOCKER_ARGS must be unquoted to allow multiple arguments - 
PRE_STOP_HOOK=${PRE_STOP_HOOK:-$(docker $DOCKER_ARGS inspect --format='{{index .Config.Labels "docker-rollout.pre-stop-hook"}}' "${OLD_CONTAINER_IDS[0]}")} + PRE_STOP_HOOK=${PRE_STOP_HOOK:-$(docker $DOCKER_ARGS inspect --format='{{index .Config.Labels "docker-rollout.pre-stop-hook"}}' "$FIRST_OLD_CONTAINER_ID")} if [ -n "$PRE_STOP_HOOK" ]; then echo "==> Running pre-stop hook: $PRE_STOP_HOOK" From 94336cfaed657c5ed1ff1d9181b13d8fdc14b337 Mon Sep 17 00:00:00 2001 From: Karol Musur Date: Thu, 29 May 2025 13:11:24 +0200 Subject: [PATCH 7/7] docs: update --- README.md | 24 +++++++------------ docs/cli-options.md | 4 ++-- ...uest-draining.md => container-draining.md} | 8 +++---- ...uest-draining.md => container-draining.md} | 4 ++-- docs/index.md | 6 ++--- 5 files changed, 19 insertions(+), 27 deletions(-) rename docs/{request-draining.md => container-draining.md} (79%) rename docs/examples/{request-draining.md => container-draining.md} (95%) diff --git a/README.md b/README.md index b962bce..ec9c3bd 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ Zero Downtime Deployment for Docker Compose

-[Docs](https://docker-rollout.wowu.dev) +[Documentation](https://docker-rollout.wowu.dev) @@ -17,7 +17,7 @@ Simply replace `docker compose up -d ` with `docker rollout ` - [Usage](#usage) - [⚠️ Caveats](#️-caveats) - [Sample deployment script](#sample-deployment-script) - - [True zero-downtime deployment with request draining](#true-zero-downtime-deployment-with-request-draining) + - [Draining old containers](#draining-old-containers) - [Why?](#why) - [License](#license) @@ -55,16 +55,16 @@ Options: - `-w | --wait SECONDS` - (not required) - Time to wait for new container to be ready if healthcheck is not defined. Default: 10 - `--wait-after-healthy SECONDS` - (not required) - Time to wait after new container is healthy before removing old container. Works when healthcheck is defined. Default: 0 - `--env-file FILE` - (not required) - Path to env file, can be specified multiple times, as in `docker compose`. -- `--pre-stop-hook` - (not required) - Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](#true-zero-downtime-deployment-with-request-draining) below. +- `--pre-stop-hook` - (not required) - Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [container draining](#draining-old-containers) below. -See [examples](https://docker-rollout.wowu.dev/examples/) in docs for sample `docker-compose.yml` files. +See [detailed options description](https://docker-rollout.wowu.dev/cli-options) and [compose.yml file examples](https://docker-rollout.wowu.dev/examples/) in docs. ### ⚠️ Caveats - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. 
- Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic. - Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). -- To avoid dropping currently processed requests when stopping the old container, you need to setup [request draining](#draining-old-containers), which requires a slightly more complex setup. +- To avoid dropping currently processed requests when stopping the old container, you need to set up [container draining](#draining-old-containers), which requires a slightly more complex setup. ### Sample deployment script @@ -83,7 +83,7 @@ docker rollout web ### Draining old containers -If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement container draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. 1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. @@ -128,19 +128,11 @@ If you want to make sure that no requests are lost during deployment, you can us docker-rollout.pre-stop-hook: "touch /tmp/drain && sleep 10" ``` - Remember that docker-rollout reads labels from the old container, so **this hook will be executed during the next deployment**. CLI options have higher priority than container labels, so you can use it to override the label value. 
+ Remember that docker-rollout reads labels from the old container, so **this hook will work on the next deployment**. CLI options have higher priority than container labels, so you can use it to override the label value. **Important:** make sure the sleep time is longer than the healthcheck `interval` × `retries` + `time to finish processing open requests` (e.g. interval: 10s, retries: 3, additional time of 5s = sleep 35) so the healthcheck has enough time to mark the container as unhealthy. -With this configuration, a rollout process looks like this: - -1. New container is started. -2. Docker daemon marks the old container as healthy. -3. Proxy starts sending requests to the new container alongside the old container. -4. We create `/tmp/drain` file in the old container. -5. Docker daemon marks the old container as unhealthy. -6. Proxy stops sending requests to the old container. -7. Old container is removed. +Read more about [container draining in the docs](https://docker-rollout.wowu.dev/container-draining). ## Why use docker-rollout? diff --git a/docs/cli-options.md b/docs/cli-options.md index 3573675..cdf5528 100644 --- a/docs/cli-options.md +++ b/docs/cli-options.md @@ -18,7 +18,7 @@ All docker flags can be used with `docker rollout` as usual, like `--context`, ` docker --context my-remote-context rollout ``` -The plugin flags are described below. +The plugin flags are described below. Some of the options can be defined as container labels. ## `-f | --file FILE` @@ -104,7 +104,7 @@ docker rollout --env-file .env --env-file .env.prod Label: `docker-rollout.pre-stop-hook` -Command to run in the old container before stopping it. Can be used for marking the container as unhealthy to make proxy stop sending requests to it, see [request draining](request-draining). +Command to run in the old container before stopping it. 
Can be used for marking the container as unhealthy to gracefully finish running requests before deleting the container, see [container draining](container-draining). **Example** diff --git a/docs/request-draining.md b/docs/container-draining.md similarity index 79% rename from docs/request-draining.md rename to docs/container-draining.md index 35dffd9..bfe7d50 100644 --- a/docs/request-draining.md +++ b/docs/container-draining.md @@ -1,11 +1,11 @@ --- -title: Request Draining +title: Container Draining nav_order: 4 --- -# True zero-downtime deployment with request draining +# True zero-downtime deployment with container draining -If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement container draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. This allows the old container to finish processing any open requests before it is stopped. 1. Add additional healthcheck to your container. The check should fail when `/tmp/drain` file is present. @@ -64,4 +64,4 @@ With this configuration, a rollout process looks like this: 6. Proxy stops sending requests to the old container. 7. Old container is removed. -See sample configuration for [Traefik](examples/request-draining.md). +See sample configuration for [Traefik](examples/container-draining.md). 
diff --git a/docs/examples/request-draining.md b/docs/examples/container-draining.md similarity index 95% rename from docs/examples/request-draining.md rename to docs/examples/container-draining.md index bf7de38..383cdf1 100644 --- a/docs/examples/request-draining.md +++ b/docs/examples/container-draining.md @@ -1,9 +1,9 @@ --- -title: Traefik w/ Request Draining +title: Traefik w/ Container Draining parent: Examples --- -# Request Draining with Traefik +# Container Draining with Traefik Works with Docker Compose v2. diff --git a/docs/index.md b/docs/index.md index 98fdac0..612c56f 100644 --- a/docs/index.md +++ b/docs/index.md @@ -30,7 +30,7 @@ Using `docker compose up` to deploy a new version of a service causes downtime b - Your service cannot have `container_name` and `ports` defined in `docker-compose.yml`, as it's not possible to run multiple containers with the same name or port mapping. Use a proxy as described below. - Proxy like [Traefik](https://github.com/traefik/traefik) or [nginx-proxy](https://github.com/nginx-proxy/nginx-proxy) is required to route traffic to the containers. Refer to the [Examples](examples) for sample compose files. - Each deployment will increment the number in container name (e.g. `project-web-1` -> `project-web-2`). -- To avoid dropping currently processed requests when stopping the old container, you need to setup [request draining](#draining-old-containers), which requires a slightly more complex setup. +- To avoid dropping currently processed requests when stopping the old container, you need to set up [container draining](#draining-old-containers), which requires a slightly more complex setup. ## Installation @@ -70,9 +70,9 @@ docker rollout web ### Draining old containers -If you want to make sure that no requests are lost during deployment, you can use the following setup to implement request draining. 
It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. +If you want to make sure that no requests are lost during deployment, you can use the following setup to implement container draining. It requires adding a healthcheck to your container that will be failing on purpose when performing rollout to make the proxy (Traefik or nginx-proxy) stop sending requests to the old container before it's removed. -See [Request draining](request-draining). +See [container draining](container-draining). ## Rationale and alternatives
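
Note on the `sleep` value in the pre-stop hook: the container-draining docs in this series give the rule `interval` × `retries` + `time to finish processing open requests`. A minimal sketch of that arithmetic follows, assuming all values are whole seconds. The function name `min_drain_sleep` and the variable `SLEEP_SECONDS` are hypothetical and not part of docker-rollout; only `--pre-stop-hook` and `/tmp/drain` come from the patches above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of docker-rollout: computes the sleep value
# for the pre-stop hook from the healthcheck settings, following the rule
# interval x retries + time to finish open requests.
min_drain_sleep() {
  interval=$1   # healthcheck interval in seconds
  retries=$2    # healthcheck retries
  finish=$3     # seconds allowed for open requests to finish
  echo $(( interval * retries + finish ))
}

# Values from the docs' own example: interval 10s, retries 3, 5s extra.
SLEEP_SECONDS=$(min_drain_sleep 10 3 5)

# Prints the hook string that could be passed to --pre-stop-hook.
echo "touch /tmp/drain && sleep $SLEEP_SECONDS"
```

With the example values this prints `touch /tmp/drain && sleep 35`, matching the figure in the docs. The result is the value from that worked example; adding a few seconds of margin on top may be a sensible choice if healthcheck timing is not exact.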