Improve Deployment Workflows #4095

cemreinanc · 2026-01-17T21:36:53Z

closes #4094

Production (deploy_production.yml)

Add migration detection to toggle the Heroku preboot feature automatically:
- Disable preboot only when migrations exist (ensures migrations run before new code)
- Enable preboot when no migrations (zero-downtime deployment)
Add GitHub Actions cache for Docker builds
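
The preboot toggle described above comes down to one branch on the migration flag. This is a minimal sketch, not the workflow's actual step: the function names and the example app name are hypothetical, and treating any value other than an explicit `false` as "migrations present" is an assumption chosen to keep the fallback safe. `heroku features:enable` and `heroku features:disable` are the regular CLI commands for preboot.

```shell
#!/bin/sh
# Sketch only: function names and the app name are hypothetical.

# Print the preboot action for a given migration flag.
# Anything other than an explicit "false" counts as "migrations present",
# so an unknown value falls back to the safe (non-preboot) path.
preboot_action() {
  if [ "$1" = "false" ]; then
    echo "enable"    # no migrations: preboot allows a zero-downtime release
  else
    echo "disable"   # migrations or uncertainty: run them before new code serves traffic
  fi
}

# Apply the action with the Heroku CLI (needs an authenticated CLI).
# $1 = app name, $2 = has_migrations flag
set_preboot() {
  heroku "features:$(preboot_action "$2")" preboot -a "$1"
}

# Example (not executed here):
#   set_preboot my-heroku-app "$HAS_MIGRATIONS"
```

Keeping the decision in a pure function makes it easy to check without touching Heroku.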

Staging (deploy_staging.yml)

Unify into a single workflow with an environment selector input
Add GitHub Actions cache for Docker builds

Benefits

Zero-downtime production deploys when no migrations exist
Faster builds via GHA caching

Summary by CodeRabbit

Chores
- Revamped production and staging deployment pipelines with a reusable Docker build, improved tagging/caching, and safer default handling.
- Unified staging workflow for dev/play targets and automated Storybook deployment trigger.
- Simplified preview pipeline naming to use a single preview ID and externalized image build.
New Features
- Migration-aware deploys that toggle preboot to reduce downtime.
- Added release-wait polling for releases.
Reliability
- Resilient CDN cache purge and sourcemap upload (non-fatal on failure).

- Introduced `deploy_production.yml` with enhanced features including zero-downtime deploy and improved caching.
- Created `deploy_staging.yml` to unify deployment for dev and play environments, also featuring better caching.

- Changed the method of detecting migration file changes by first fetching all changed files and then filtering for migration files using grep

coderabbitai · 2026-01-17T21:37:07Z

Walkthrough

Adds reusable Docker build workflow and new production and unified staging deploy workflows. Production deploy detects migrations to toggle Heroku preboot, builds via shared workflow, performs Heroku container releases, waits for completion, does best-effort Cloudflare purge and Sentry sourcemap upload, and triggers Storybook. New helper scripts support checks and polling.

Changes

Cohort / File(s)	Summary
Reusable Docker build `/.github/workflows/docker_build.yml`	New reusable workflow: computes branch_slug/metadata, generates tags & labels (including optional latest), applies branch-scoped cache strategy, builds & pushes images to GHCR, optionally builds/pushes Heroku targets, and exposes outputs like `image_url`, `image_tags`, `primary_tag`, `heroku_targets`.
Production Deployment `/.github/workflows/deploy_production.yml`, `scripts/check_migrations_since_deploy.sh`, `scripts/wait_for_heroku_release.sh`	New production workflow: sparse checkout → migration detection (sets `has_migrations`) → invokes `docker_build` → installs Heroku CLI → toggles preboot based on migrations → releases container targets using `image_url` → polls release via `wait_for_heroku_release.sh` → best-effort Cloudflare purge & Sentry sourcemap upload → always-reset preboot → triggers Storybook.
Staging Deployment (Unified) `/.github/workflows/deploy_staging.yml`	New unified staging workflow for dev/play: uses `docker_build` with environment parameterization, logs into registries, releases Heroku container targets, and waits for release completion; replaces previous dev/play workflows.
PR preview adjustments `/.github/workflows/pr_preview.yml`	PR preview updated: inline branch-slug computation, removed old image outputs, replaced inline Docker steps with reusable `docker_build` usage, and switched downstream references to `needs.build-image.outputs.image_url`.
Removed / Replaced workflows `/.github/workflows/deploy_dev.yml`, `/.github/workflows/deploy_play.yml`, `/.github/workflows/deploy_prod.yml`	Old per-environment deploy workflows consolidated/replaced by the new `deploy_staging.yml` and `deploy_production.yml`.

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant GHCR as GHCR (Registry)
    participant Heroku as Heroku
    participant Cloudflare as Cloudflare
    participant Sentry as Sentry

    GHA->>GHA: Checkout repo & run `check_migrations_since_deploy.sh`
    GHA->>GHCR: Invoke reusable `docker_build` → build & push images (emit `image_url`)
    alt has_migrations == true
        GHA->>Heroku: Disable preboot for app
    else
        GHA->>Heroku: Ensure preboot enabled
    end
    GHA->>Heroku: Login registry + release container targets using `image_url`
    loop poll release status
        Heroku-->>GHA: release status (pending/succeeded/failed)
    end
    alt release succeeded
        GHA->>Cloudflare: Purge zone cache (best-effort)
        GHA->>Sentry: Upload sourcemaps via Docker (best-effort)
        GHA->>GHA: Trigger Storybook workflow
        GHA->>Heroku: Reset preboot to safe state (always)
    else release failed
        GHA-->>GHA: Fail workflow
    end

sequenceDiagram
    participant GHA as GitHub Actions
    participant GHCR as GHCR (Registry)
    participant Heroku as Heroku

    GHA->>GHCR: Invoke `docker_build` (tags, caching, push)
    GHA->>Heroku: Authenticate (registry + CLI)
    GHA->>Heroku: Release staging app containers (targets from build)
    loop poll release status
        Heroku-->>GHA: release status updates
    end
    GHA->>GHA: Mark success/failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Implement Preview Deployments #4101 — Related changes to .github/workflows/pr_preview.yml switching the preview workflow to the shared docker_build.yml and updating image outputs.

Suggested reviewers

ncarazon
hlbmtc

Poem

🐇 I hopped through branches, sniffed the migration trail,
If tables rumble, I whisper preboot pale.
Else I nudge containers, let the images fly,
I purge the clouds and map sourcemaps high.
Tiny paws, big deploys — I cheer, oh my!

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	Title 'Improve Deployment Workflows' is generic and vague, failing to specify which workflows are improved or the key changes like migration detection and staging unification.	Consider a more specific title like 'Add migration detection for zero-downtime deploys and unify staging workflows' to better reflect the main improvements.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	All primary objectives from issue `#4094` are met: migration detection with preboot toggling [`#4094`], unified staging workflow [`#4094`], and GitHub Actions Docker caching [`#4094`] are implemented.
Out of Scope Changes check	✅ Passed	All changes align with stated objectives: new workflows for production and staging, migration detection scripts, helper scripts for Heroku release polling, and workflow refactoring for PR previews to use centralized Docker builds.

.github/workflows/deploy_production.yml

…in permissions Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In @.github/workflows/deploy_production.yml:
- Around line 168-173: The Cloudflare purge step ("Cloudflare: Purge cache")
currently fires a curl POST but ignores the response; update the run block to
capture and validate the response: store the curl output and HTTP status, print
the full response for debugging, parse the JSON and check the HTTP status and
the JSON "success" field, and exit non‑zero (fail the workflow) if the purge did
not succeed so failures are visible in the workflow logs.
- Around line 42-44: The deployment selector stored in LAST_DEPLOYED currently
picks the first deployment for prod_env regardless of outcome; update the gh api
logic so it only considers deployments whose latest status is "success" by: list
deployments for the repo (same gh api call in the LAST_DEPLOYED assignment), for
each deployment fetch its statuses
(repos/:owner/:repo/deployments/:deployment_id/statuses) and ensure the most
recent status.state == "success", then pick the first such deployment and use
its .sha; adjust the command referenced by LAST_DEPLOYED to perform this
filtering (or replace the single jq filter with a two-step query that checks
deployment statuses) so only successful prod_env deployments are considered.

🧹 Nitpick comments (4)

.github/workflows/deploy_production.yml (4)
53-59: Git fetch strategy may silently fail to retrieve required history.

The --shallow-exclude option requires the specified commit to exist in the remote history. If $LAST_DEPLOYED is old or pruned, both fetch commands may fail silently (due to || true), and the subsequent git diff will then trigger safe_default. While the fallback is safe, consider adding a warning when fetch fails to aid debugging.
Proposed improvement for better debugging
          # Fetch only the history needed to compare with last deployment
          # This is faster than fetching full history with fetch-depth: 0
-         git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || true
-         git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || true
+         git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || echo "::debug::shallow-exclude fetch failed, trying deepen"
+         git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || echo "::debug::deepen fetch also failed"
91-92: Consider pinning Heroku CLI version for reproducibility.

Piping curl to sh is the officially recommended installation method, but it always installs the latest version. For reproducible builds, consider pinning to a specific version or using a pre-built action.
Alternative: Use a community action with version pinning
      - name: Install Heroku CLI
        uses: akhileshns/heroku-deploy@v3.13.15
        with:
          heroku_api_key: ${{ secrets.HEROKU_API_KEY }}
          heroku_app_name: ""  # Just to get CLI installed
          justlogin: true
Or use npm for version control:
      - name: Install Heroku CLI
        run: npm install -g heroku@10.0.0
99-101: Consider implications of latest tag for production.

Using latest tag in production can complicate rollback scenarios since it's overwritten on each deploy. The SHA-prefixed tag provides traceability, which is good. Verify that your rollback strategy uses the SHA-based tags rather than latest.

175-177: Add failure handling for Sentry sourcemaps upload.

If the Sentry upload fails, the workflow will fail at this step. Consider whether a failed sourcemap upload should block the deployment or just warn. If it should not block, add || true or continue-on-error: true.
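
One way to make a step like the Sentry upload non-blocking while still surfacing the failure is a small wrapper that converts a non-zero exit into a warning annotation. This is a sketch under assumptions: `run_best_effort` is a hypothetical helper name, not something from the workflow. The `::warning::` prefix is the GitHub Actions workflow command for a warning annotation.

```shell
#!/bin/sh
# Sketch only: run_best_effort is a hypothetical helper name.

# run_best_effort LABEL COMMAND [ARGS...]
# Runs the command; on failure, prints a GitHub Actions warning annotation
# and still returns 0 so later steps continue.
run_best_effort() {
  label="$1"; shift
  "$@" && return 0
  status=$?   # exit status of the failed command
  echo "::warning::${label} failed with exit code ${status} (non-fatal)"
  return 0
}

# Example (not executed here):
#   run_best_effort "Sentry sourcemap upload" docker run ... npx sentry-cli sourcemaps upload ...
```

Compared with a bare `|| true`, the failure stays visible in the run summary.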

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 915dc8a and 8f2e63b.

📒 Files selected for processing (1)

.github/workflows/deploy_production.yml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: integration-tests

🔇 Additional comments (6)

.github/workflows/deploy_production.yml (6)

17-20: Permissions appropriately scoped.

The explicit permissions block addresses the previously flagged code scanning alert. The permissions follow the principle of least privilege with only the required scopes (contents: read, packages: write, deployments: write).

115-126: Efficient cache reuse from prior build step.

The Heroku target builds correctly reuse the GHA cache populated by the all_runners build. Since the cache is already populated with mode=max, not writing cache again (--cache-to) avoids redundant cache operations.

128-136: Preboot toggle logic is sound.

The conditional correctly disables preboot when migrations are detected (ensuring migrations complete before serving new code) and enables it for zero-downtime deployments when no migrations exist.

142-166: Release wait loop handles edge cases correctly.

The loop properly waits for pending releases and validates both succeeded status and current=true before exiting successfully. The 7.5-minute timeout (90 × 5s) is reasonable for Heroku releases.

179-185: Storybook job permissions are correctly scoped.

The job has explicit contents: read permission and uses secrets: inherit to pass required secrets to the reusable workflow. This addresses the previous code scanning alert about missing permissions.

29-29: No changes needed — actions/checkout@v6 is stable and current.

v6.0.1 was released on December 2, 2025, and is the latest stable version. Using v6 is appropriate.

Likely an incorrect or invalid review comment.

.github/workflows/deploy_production.yml

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/workflows/deploy_production.yml:
- Around line 43-60: The GraphQL query currently inspects deployment statuses
via statuses(first: 1) which doesn't guarantee the latest status; update the
query to use the Deployment.latestStatus field instead of statuses, i.e., in the
query block that requests repository { deployments { nodes { commitOid
statuses(...) } } } replace the statuses selection with latestStatus { state }
and adjust the jq selector to check .latestStatus.state when selecting a
successful deployment commitOid so the script reliably picks the latest
successful deployment.

♻️ Duplicate comments (1)

.github/workflows/deploy_production.yml (1)

184-189: Cloudflare purge response still unchecked.
This is the same concern flagged earlier: the purge call doesn’t validate success, so failures can be silent.

🧹 Nitpick comments (1)

.github/workflows/deploy_production.yml (1)

131-142: Persist build cache for Heroku targets.
--cache-from without --cache-to means target‑specific layers won’t be saved, so cache benefit is limited. Add a cache‑to (optionally with a scope) to keep it warm across runs.

Proposed change

-            docker buildx build \
+            docker buildx build \
               --platform linux/amd64 \
               --target "$target" \
               --tag "registry.heroku.com/${{ env.HEROKU_APP }}/$target" \
               --push \
-              --cache-from type=gha \
+              --cache-from type=gha,scope=heroku-targets \
+              --cache-to type=gha,mode=max,scope=heroku-targets \
               .

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f2e63b and f77729b.

📒 Files selected for processing (1)

.github/workflows/deploy_production.yml

🔇 Additional comments (1)

.github/workflows/deploy_production.yml (1)

28-29: No action needed. actions/checkout@v6 is the current stable major version released by GitHub Actions and is properly used in the workflow.

Likely an incorrect or invalid review comment.

.github/workflows/deploy_production.yml

…GraphQL query

elisescu

Left some inline comments

elisescu · 2026-01-20T09:47:32Z

.github/workflows/deploy_production.yml

+          git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || true
+          git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || true


Any reason for choosing not to fail if the fetch commands fail here? Or why the || true?

If they fail, we don't want to abort the whole workflow; instead, we rely on the git diff below to catch missing history and trigger safe_default.
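
The fallback described in this reply can be sketched as a function that takes the diff's exit status and the list of changed files. This is illustrative only: the function name and the Django-style `migrations/*.py` path pattern are assumptions, and the real logic lives in `scripts/check_migrations_since_deploy.sh`.

```shell
#!/bin/sh
# Sketch only: the function name and the path pattern are assumptions.

# has_migrations DIFF_STATUS CHANGED_FILES
# Prints "true" or "false". A failed diff (for example, history missing
# after a failed shallow fetch) is treated as "migrations present".
has_migrations() {
  if [ "$1" -ne 0 ]; then
    echo "true"   # safe default: cannot compare, so assume migrations
    return 0
  fi
  if printf '%s\n' "$2" | grep -Eq '(^|/)migrations/[^/]+\.py$'; then
    echo "true"
  else
    echo "false"
  fi
}

# Example wiring (not executed here):
#   files=$(git diff --name-only "$LAST_DEPLOYED" HEAD 2>/dev/null)
#   has_migrations "$?" "$files"
```

Because a failed fetch only ever pushes the result toward `true`, swallowing the fetch error cannot enable preboot by mistake.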

.github/workflows/deploy_production.yml

…ling
- Replaced `deploy_prod.yml` and `deploy_dev.yml` with a unified `deploy_production.yml` and `deploy_staging.yml` that utilize a shared Docker build workflow for better caching and consistency.
- Introduced migration detection in the production workflow to toggle preboot settings based on database migrations.
- Updated the PR preview workflow to streamline Docker image building and deployment processes.
- Enhanced Cloudflare cache purge logic to capture and log HTTP response status and errors.

.github/workflows/docker_build.yml

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In @.github/workflows/docker_build.yml:
- Around line 138-154: The inline script uses user-controlled github.head_ref
directly which risks shell injection; set the branch value via an action env var
and then read that safe env var inside the script (e.g., add env: BRANCH_INPUT:
${{ github.head_ref || github.ref_name }} on the step with id "vars"), then in
the run block use BRANCH="$BRANCH_INPUT" (or BRANCH="$(printf '%s'
"$BRANCH_INPUT")") and continue computing BRANCH_SLUG from BRANCH; keep
IMAGE_NAME/IMAGE_NAME_LOWER handling the same but ensure all expansions are
quoted to prevent word-splitting.

In @.github/workflows/pr_preview.yml:
- Around line 127-142: The step currently injects github.head_ref directly into
the inline script (used via BRANCH_NAME) which is a script-injection risk;
change the step to pass the branch ref through a GitHub Actions environment
variable (e.g., set an env: BRANCH_REF: "${{ github.head_ref || github.ref_name
}}" on the step) and then use that env var inside the script to compute
BRANCH_NAME, BRANCH_SLUG and PREVIEW_ID; keep the same slugification logic but
ensure you reference BRANCH_REF (not github.head_ref) when creating BRANCH_NAME
and when writing outputs like fly_app, neon_branch and redis_name.
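
Both findings above point at the same pattern: pass the ref through the step's environment, then only ever expand it inside double quotes. A minimal sketch follows. `BRANCH_INPUT` and `slugify_branch` are illustrative names, and the slug rules (lowercase, collapse non-alphanumerics to `-`, trim leading and trailing `-`) are an assumption that may differ from what the workflows actually do.

```shell
#!/bin/sh
# Sketch only: BRANCH_INPUT and slugify_branch are illustrative names, and
# the exact slug rules used by the workflows may differ.
#
# In the workflow step, the ref would arrive via the environment instead of
# being interpolated into the script text:
#   env:
#     BRANCH_INPUT: ${{ github.head_ref || github.ref_name }}

# slugify_branch BRANCH -> prints a lowercase, dash-separated slug
slugify_branch() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Example (not executed here). The value is only expanded inside double
# quotes, so shell metacharacters in a branch name stay plain text:
#   BRANCH_SLUG="$(slugify_branch "$BRANCH_INPUT")"
#   echo "branch_slug=${BRANCH_SLUG}" >> "$GITHUB_OUTPUT"
```

With the expression moved into `env:`, a branch named `feat/$(whoami)` reaches the script as data and is never evaluated.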

🧹 Nitpick comments (6)

scripts/wait_for_heroku_release.sh (1)
39-57: Handle edge cases for empty releases or null status.

If the Heroku app has no releases yet, or if the JSON parsing returns null or empty values, the script may produce unexpected behavior. The status variable could be null or empty, causing neither the pending check nor the succeeded check to match, which would incorrectly report a failure.
Proposed defensive handling
 for ((i = 1; i <= MAX_ATTEMPTS; i++)); do
   json=$(heroku releases --json -a "$APP_NAME")
   status=$(echo "$json" | jq -r '.[0].status')
   current=$(echo "$json" | jq -r '.[0].current')
 
+  if [ -z "$status" ] || [ "$status" == "null" ]; then
+    echo "Warning: Could not determine release status (attempt $i/$MAX_ATTEMPTS)..."
+    sleep "$POLL_INTERVAL"
+    continue
+  fi
+
   if [ "$status" == "pending" ]; then
     echo "Waiting for release to finish (attempt $i/$MAX_ATTEMPTS)..."
     sleep "$POLL_INTERVAL"
     continue
   fi
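
The status handling in the proposed diff can be separated from the polling itself, which makes the edge cases easy to exercise without a Heroku app. This is a sketch, not the contents of `scripts/wait_for_heroku_release.sh`: the helper names are hypothetical, and treating a release that is `succeeded` but not `current` as a failure is an assumption.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; "succeeded but not current"
# being a failure is an assumption.

# release_verdict STATUS CURRENT -> prints "wait", "ok", or "fail"
release_verdict() {
  case "$1" in
    ""|null|pending) echo "wait" ;;   # unknown or still running: poll again
    succeeded)
      if [ "$2" = "true" ]; then echo "ok"; else echo "fail"; fi ;;
    *) echo "fail" ;;                 # for example "failed"
  esac
}

# wait_for_release MAX_ATTEMPTS INTERVAL COMMAND [ARGS...]
# COMMAND must print "<status> <current>" for the newest release.
wait_for_release() {
  max_attempts="$1"; interval="$2"; shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    read -r status current <<EOF
$("$@")
EOF
    case "$(release_verdict "$status" "$current")" in
      ok) return 0 ;;
      fail) return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1   # timed out while still waiting
}

# Example command (not executed here; needs the Heroku CLI and jq):
#   latest() { heroku releases --json -a "$APP_NAME" | jq -r '.[0] | "\(.status) \(.current)"'; }
#   wait_for_release 90 5 latest
```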
scripts/check_migrations_since_deploy.sh (1)
57-73: Consider increasing deployment fetch limit or documenting the limitation.

The GraphQL query fetches only the first 20 deployments. If there have been more than 20 consecutive failed or non-success deployments, no successful deployment will be found, triggering the safe default. While this is unlikely in normal operations, it's worth documenting or increasing the limit.
Proposed adjustment
-      deployments(environments: [$env], first: 20, orderBy: {field: CREATED_AT, direction: DESC}) {
+      deployments(environments: [$env], first: 50, orderBy: {field: CREATED_AT, direction: DESC}) {
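
For context, a lookup of the last successful deployment along these lines might look like the sketch below. It assumes an authenticated `gh` CLI; `last_successful_sha` is a hypothetical helper name, and the page size of 50 simply mirrors the adjustment proposed above. The actual query in `scripts/check_migrations_since_deploy.sh` may be structured differently.

```shell
#!/bin/sh
# Sketch only: last_successful_sha is a hypothetical helper; requires an
# authenticated gh CLI.

# last_successful_sha OWNER REPO ENVIRONMENT
# Prints the commit SHA of the newest deployment whose latest status is
# SUCCESS, or nothing if none is found in the fetched page.
last_successful_sha() {
  gh api graphql \
    -f owner="$1" -f name="$2" -f env="$3" \
    -f query='
      query($owner: String!, $name: String!, $env: String!) {
        repository(owner: $owner, name: $name) {
          deployments(environments: [$env], first: 50,
                      orderBy: {field: CREATED_AT, direction: DESC}) {
            nodes { commitOid latestStatus { state } }
          }
        }
      }' \
    --jq '[.data.repository.deployments.nodes[]
           | select(.latestStatus.state == "SUCCESS")][0].commitOid // empty'
}

# Example (not executed here):
#   LAST_DEPLOYED="$(last_successful_sha Metaculus metaculus prod_env)"
#   [ -n "$LAST_DEPLOYED" ] || echo "no successful deployment found, using safe default"
```

An empty result is the signal for the caller to fall back to the safe default.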
.github/workflows/docker_build.yml (2)
174-186: Quote variables in shell comparisons per shellcheck SC2086.

The shell variables in the conditional comparisons should be quoted to prevent word splitting and globbing issues.
Proposed fix
       - name: Compute cache scope
         id: cache
         run: |
           BRANCH_SLUG="${{ steps.vars.outputs.branch_slug }}"
           
           # Determine if this is the main branch (writes to main cache)
-          if [ "${{ github.ref_name }}" = "main" ] || [ "${{ github.ref_name }}" = "master" ]; then
+          REF_NAME="${{ github.ref_name }}"
+          if [ "$REF_NAME" = "main" ] || [ "$REF_NAME" = "master" ]; then
-            echo "cache_write_tag=buildcache-main" >> $GITHUB_OUTPUT
+            echo "cache_write_tag=buildcache-main" >> "$GITHUB_OUTPUT"
           else
-            echo "cache_write_tag=buildcache-${BRANCH_SLUG}" >> $GITHUB_OUTPUT
+            echo "cache_write_tag=buildcache-${BRANCH_SLUG}" >> "$GITHUB_OUTPUT"
           fi
           
-          echo "cache_scope=${BRANCH_SLUG}" >> $GITHUB_OUTPUT
+          echo "cache_scope=${BRANCH_SLUG}" >> "$GITHUB_OUTPUT"
205-222: Quote the Heroku targets variable in the for loop.

The heroku_targets input is unquoted in the for loop. While unlikely to cause issues with the default value, quoting would be safer if custom targets with special characters were ever used.
Proposed fix
       - name: Build and push Heroku targets
         if: inputs.heroku_app != ''
+        env:
+          HEROKU_TARGETS: ${{ inputs.heroku_targets }}
         run: |
           set -e
           IMAGE_NAME="${{ steps.vars.outputs.image_name }}"
           CACHE_SCOPE="${{ steps.cache.outputs.cache_scope }}"
           
-          for target in ${{ inputs.heroku_targets }}; do
+          for target in $HEROKU_TARGETS; do
             echo "Building Heroku target: $target"
.github/workflows/deploy_staging.yml (1)

57-58: Consider making the wait timeout configurable per environment.

Production uses 90 attempts while staging uses 60. If staging environments have different release characteristics, this is fine. Otherwise, consider using a consistent timeout or making it configurable.
.github/workflows/deploy_production.yml (1)
117-119: Consider splitting the Sentry upload command for readability.

The single-line Docker command is quite long. Breaking it into multiple lines would improve readability.
Proposed formatting
       - name: "Sentry: Upload sourcemaps"
         run: |
-          docker run -w /app/front_end -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} ${{ needs.build.outputs.image_url }} npx sentry-cli sourcemaps upload /app/front_end/.next/ --project metaculus-frontend --org metaculus
+          docker run \
+            -w /app/front_end \
+            -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} \
+            ${{ needs.build.outputs.image_url }} \
+            npx sentry-cli sourcemaps upload /app/front_end/.next/ \
+              --project metaculus-frontend \
+              --org metaculus

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c89da4 and 8d85bc1.

📒 Files selected for processing (6)

.github/workflows/deploy_production.yml
.github/workflows/deploy_staging.yml
.github/workflows/docker_build.yml
.github/workflows/pr_preview.yml
scripts/check_migrations_since_deploy.sh
scripts/wait_for_heroku_release.sh

🧰 Additional context used

🪛 actionlint (1.7.10)

.github/workflows/docker_build.yml

140-140: shellcheck reported issue in this script: SC2086:info:12:38: Double quote to prevent globbing and word splitting

(shellcheck)

140-140: shellcheck reported issue in this script: SC2086:info:4:42: Double quote to prevent globbing and word splitting

(shellcheck)

140-140: "github.head_ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details

(expression)

176-176: shellcheck reported issue in this script: SC2086:info:10:38: Double quote to prevent globbing and word splitting

(shellcheck)

176-176: shellcheck reported issue in this script: SC2086:info:5:45: Double quote to prevent globbing and word splitting

(shellcheck)

176-176: shellcheck reported issue in this script: SC2086:info:7:55: Double quote to prevent globbing and word splitting

(shellcheck)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build Docker Image / Build Docker Image

🔇 Additional comments (6)

scripts/check_migrations_since_deploy.sh (1)

1-109: Well-designed migration detection with safe fallbacks.

The script follows a fail-safe approach by defaulting to has_migrations=true (disabling preboot) when any uncertainty exists. The use of latestStatus.state addresses the previous review feedback about reliable status ordering. The separation into a standalone script improves testability as requested.

.github/workflows/pr_preview.yml (1)

147-154: Good consolidation to shared Docker build workflow.

The migration to the reusable docker_build.yml workflow reduces code duplication and ensures consistent build configuration across all deployment targets.

.github/workflows/deploy_staging.yml (1)

26-58: Clean unification of staging workflows.

The workflow successfully consolidates dev and play deployments into a single parameterized workflow, reducing maintenance overhead. The sparse checkout for the scripts directory is an efficient optimization.

.github/workflows/deploy_production.yml (3)

62-70: Preboot toggle logic is correct.

The migration-based preboot configuration correctly disables preboot when migrations exist (ensuring migrations run before new code) and enables it for zero-downtime deployments when no migrations are present. This aligns with the PR objectives.

85-115: Robust Cloudflare cache purge implementation.

The error handling captures HTTP status and validates the JSON response, addressing the previous review feedback. Using exit 0 with ::warning:: appropriately allows the deployment to succeed while alerting on cache purge issues.

121-125: Good safety mechanism to reset preboot state.

Using always() ensures preboot is disabled even if previous steps fail, maintaining a safe default state for legacy deployment scripts as mentioned in the past review discussion.
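
The non-fatal Cloudflare purge check praised above (lines 85-115) can be sketched as follows. This is not the workflow's code: the helper names are hypothetical, `CF_ZONE_ID` and `CF_API_TOKEN` are placeholder variable names, and the `grep` test is a crude stand-in for parsing the JSON with `jq`.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the grep check is a crude
# stand-in for parsing the JSON with jq.

# With `curl -w '\n%{http_code}'` the HTTP status is the last line of output.
response_status() { printf '%s\n' "$1" | tail -n 1; }
response_body()   { printf '%s\n' "$1" | sed '$d'; }

# purge_verdict HTTP_STATUS BODY
# Prints "ok" or a ::warning:: annotation; always returns 0 (non-fatal).
purge_verdict() {
  if [ "$1" = "200" ] && printf '%s' "$2" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'; then
    echo "ok"
  else
    echo "::warning::Cloudflare purge failed (HTTP $1): $2"
  fi
  return 0
}

# Example wiring (not executed here; CF_ZONE_ID and CF_API_TOKEN are placeholders):
#   response=$(curl -sS -w '\n%{http_code}' -X POST \
#     "https://api.cloudflare.com/client/v4/zones/${CF_ZONE_ID}/purge_cache" \
#     -H "Authorization: Bearer ${CF_API_TOKEN}" \
#     -H "Content-Type: application/json" \
#     --data '{"purge_everything":true}')
#   purge_verdict "$(response_status "$response")" "$(response_body "$response")"
```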

.github/workflows/docker_build.yml

.github/workflows/pr_preview.yml

github-actions · 2026-01-20T23:23:54Z

🚀 Preview Environment

Your preview environment is ready!

Resource	Details
🌐 Preview URL	https://metaculus-pr-4095-chore-deploy-actions-improveme.fly.dev
📦 Docker Image	`ghcr.io/metaculus/metaculus:chore-deploy-actions-improvements-d8f75f2`
🗄️ PostgreSQL	NeonDB branch `preview/pr-4095-chore-deploy-actions-improveme`
⚡ Redis	Fly Redis `mtc-redis-pr-4095-chore-deploy-actions-improveme`

Details

Commit: d8f75f28c8b24f15f04a0f2352555d37b22bd97a
Branch: chore/deploy-actions-improvements
Fly App: metaculus-pr-4095-chore-deploy-actions-improveme

ℹ️ Preview Environment Info

Isolation:

PostgreSQL and Redis are fully isolated from production
Each PR gets its own database branch and Redis instance
Changes pushed to this PR will trigger a new deployment

Limitations:

Background workers and cron jobs are not deployed in preview environments
If you need to test background jobs, use Heroku staging environments

Cleanup:

This preview will be automatically destroyed when the PR is closed

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/workflows/deploy_production.yml:
- Around line 100-108: The deploy job's "Sentry: Upload sourcemaps" step runs
docker run which pulls the GHCR image but lacks authentication; add a new step
that uses docker/login-action@v3 to perform docker login to ghcr.io (using
github.actor and secrets.GITHUB_TOKEN) immediately before the "Sentry: Upload
sourcemaps" step so the subsequent docker run can pull the private image
successfully.

🧹 Nitpick comments (2)

.github/workflows/deploy_production.yml (2)
1-6: Track the TODO for retiring the legacy workflow.
Consider creating a small tracking issue so this doesn’t linger indefinitely.

28-36: Confirm migration check works with a shallow checkout.
If the script relies on git diff between SHAs, the default depth of 1 can make the comparison incomplete. If that’s the case, fetch full history (or the base SHA) here.
🛠️ Suggested tweak (if history is required)
       - name: Checkout code
         uses: actions/checkout@v6
+        with:
+          fetch-depth: 0

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71ee2e4 and ca839d0.

📒 Files selected for processing (1)

.github/workflows/deploy_production.yml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Build Docker Image / Build Docker Image
GitHub Check: integration-tests
GitHub Check: Frontend Checks
GitHub Check: Backend Checks
GitHub Check: Analyze (javascript-typescript)

🔇 Additional comments (8)

.github/workflows/deploy_production.yml (8)

13-16: Scoped permissions look good.

41-47: Reusable build workflow integration looks good.

62-70: Preboot toggle logic matches migration detection.

78-83: Deploy + release wait sequencing looks solid.

85-98: Cloudflare purge step is clear and safely non-fatal.

110-114: Preboot reset safeguard is good.

116-122: Storybook deployment trigger looks good.

25-27: actions/checkout@v6 is a valid released major version. v6.0.2 was released on Jan 9, 2026, and is the latest major release. Both checkouts at lines 25-27 and 72-74 are correctly pinned and will not fail due to version unavailability. No changes required.

Likely an incorrect or invalid review comment.

coderabbitai · 2026-01-21T11:36:51Z

.github/workflows/deploy_production.yml

+      - name: "Sentry: Upload sourcemaps"
+        run: |
+          docker run \
+            -w /app/front_end \
+            -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} \
+            ${{ needs.build.outputs.image_url }} \
+            npx sentry-cli sourcemaps upload /app/front_end/.next/ \
+              --project metaculus-frontend \
+              --org metaculus


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd .github/workflows && wc -l deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 88

🏁 Script executed:

cat -n .github/workflows/deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 4970

🏁 Script executed:

cat -n .github/workflows/docker_build.yml

Repository: Metaculus/metaculus

Length of output: 10285

🏁 Script executed:

# Check if there's any docker login in the deploy job or elsewhere
grep -n "docker login\|docker/login-action" .github/workflows/deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 45

🏁 Script executed:

# Check if the repository is public or private
git ls-files | head -20

Repository: Metaculus/metaculus

Length of output: 686

Add docker login to GHCR before docker run in the deploy job.

The deploy job runs on a fresh runner without inherited authentication from the build job. When docker run attempts to pull the GHCR image (a private registry for this repository), it will fail without prior docker login. Add a docker/login-action step before the Sentry upload step:

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

🤖 Prompt for AI Agents

In @.github/workflows/deploy_production.yml around lines 100 - 108, The deploy job's "Sentry: Upload sourcemaps" step runs docker run which pulls the GHCR image but lacks authentication; add a new step that uses docker/login-action@v3 to perform docker login to ghcr.io (using github.actor and secrets.GITHUB_TOKEN) immediately before the "Sentry: Upload sourcemaps" step so the subsequent docker run can pull the private image successfully.

cemreinanc added 2 commits January 17, 2026 21:19

Implement new production and staging deployment workflows

e10ff29

- Introduced `deploy_production.yml` with enhanced features including zero-downtime deploy and improved caching. - Created `deploy_staging.yml` to unify deployment for dev and play environments, also featuring better caching.

fix migration file detection

915dc8a

- Changed the method of detecting migration file changes by first fetching all changed files and then filtering for migration files using grep
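
The detection approach this commit describes (list every changed file first, then filter the list with grep) can be sketched roughly as follows. This is not the workflow's actual script: the has_migrations helper and the path pattern are hypothetical, and the pattern assumes Django-style migrations/ directories.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the PR.
# Reads changed file paths on stdin (one per line) and prints "true" when at
# least one of them looks like a migration file, "false" otherwise.
has_migrations() {
  # Assumed pattern: any .py file directly inside a migrations/ directory.
  # The second grep drops __init__.py, which does not change the schema.
  if grep -E '(^|/)migrations/[^/]+\.py$' | grep -qv '__init__\.py$'; then
    echo "true"
  else
    echo "false"
  fi
}

# Example wiring (commented out so the snippet has no side effects):
#   git diff --name-only "$LAST_DEPLOYED" HEAD | has_migrations
```

In a workflow, the result would then decide whether to run heroku features:disable preboot (migrations present) or heroku features:enable preboot (no migrations) before the release. Filtering a full file list, rather than passing a pathspec to git diff, keeps the pattern in one place and makes the check easy to exercise with plain text input.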

cemreinanc had a problem deploying to testing_env January 17, 2026 21:36 — with GitHub Actions Error

cemreinanc temporarily deployed to testing_env January 17, 2026 21:36 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Jan 17, 2026

.github/workflows/deploy_production.yml Fixed

Potential fix for code scanning alert no. 15: Workflow does not contain permissions

8f2e63b

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

cemreinanc temporarily deployed to testing_env January 17, 2026 21:41 — with GitHub Actions Inactive

coderabbitai bot reviewed Jan 17, 2026

.github/workflows/deploy_production.yml Outdated, resolved

.github/workflows/deploy_production.yml Outdated, resolved

cemreinanc marked this pull request as draft January 17, 2026 22:02

fetch last successful commit using GraphQL

f77729b

cemreinanc temporarily deployed to testing_env January 17, 2026 22:21 — with GitHub Actions Inactive

cemreinanc marked this pull request as ready for review January 19, 2026 00:19

coderabbitai bot reviewed Jan 19, 2026

.github/workflows/deploy_production.yml Outdated, resolved

fetch the last successful deployment commit by latestStatus in the GraphQL query

2c89da4

cemreinanc temporarily deployed to testing_env January 19, 2026 01:35 — with GitHub Actions Inactive

cemreinanc requested review from elisescu, hlbmtc and ncarazon January 19, 2026 01:41

cemreinanc mentioned this pull request Jan 20, 2026

Implement Preview Deployments #4101

Merged

elisescu reviewed Jan 20, 2026

.github/workflows/deploy_production.yml Resolved

cemreinanc added 2 commits January 20, 2026 15:31

Merge branch 'main' into chore/deploy-actions-improvements

ef16901

cemreinanc had a problem deploying to testing_env January 20, 2026 22:59 — with GitHub Actions Error

cemreinanc temporarily deployed to testing_env January 20, 2026 22:59 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Jan 20, 2026

.github/workflows/docker_build.yml Fixed

Potential fix for code scanning alert no. 16: Workflow does not contain permissions

8d85bc1

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

cemreinanc temporarily deployed to testing_env January 20, 2026 23:03 — with GitHub Actions Inactive

cemreinanc temporarily deployed to Preview January 20, 2026 23:19 — with GitHub Actions Inactive

coderabbitai bot reviewed Jan 20, 2026

.github/workflows/docker_build.yml Resolved

.github/workflows/pr_preview.yml Resolved

implement security suggestion

71ee2e4

cemreinanc temporarily deployed to testing_env January 20, 2026 23:37 — with GitHub Actions Inactive

cemreinanc temporarily deployed to Preview January 20, 2026 23:44 — with GitHub Actions Inactive

simplify

ca839d0

cemreinanc temporarily deployed to testing_env January 21, 2026 11:26 — with GitHub Actions Inactive

cemreinanc deployed to Preview January 21, 2026 11:33 — with GitHub Actions View deployment

coderabbitai bot reviewed Jan 21, 2026

		git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || true
		git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || true
