@cemreinanc
Contributor

@cemreinanc cemreinanc commented Jan 17, 2026

closes #4094

Production (deploy_production.yml)

  • Add migration detection to toggle heroku preboot feature automatically:
    • Disable preboot only when migrations exist (ensures migrations run before new code)
    • Enable preboot when no migrations (zero-downtime deployment)
  • Add GitHub Actions cache for Docker builds

Staging (deploy_staging.yml)

  • Unify into single workflow with environment selector input
  • Add GitHub Actions cache for Docker builds

Benefits

  • Zero-downtime production deploys when no migrations exist
  • Faster builds via GHA caching

Summary by CodeRabbit

  • Chores

    • Revamped production and staging deployment pipelines with a reusable Docker build, improved tagging/caching, and safer default handling.
    • Unified staging workflow for dev/play targets and automated Storybook deployment trigger.
    • Simplified preview pipeline naming to use a single preview ID and externalized image build.
  • New Features

    • Migration-aware deploys that toggle preboot to reduce downtime.
    • Added release-wait polling for releases.
  • Reliability

    • Resilient CDN cache purge and sourcemap upload (non-fatal on failure).

✏️ Tip: You can customize this high-level summary in your review settings.

- Introduced `deploy_production.yml` with enhanced features including zero-downtime deploy and improved caching.
- Created `deploy_staging.yml` to unify deployment for dev and play environments, also featuring better caching.
- Changed the method of detecting migration file changes: first fetch all changed files, then filter for migration files using grep.
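
A minimal sketch of the "list changed files, then filter with grep" approach described in the last commit note. The `has_migration_changes` helper and the Django-style `migrations/*.py` path pattern are assumptions for illustration, not the contents of the actual script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads changed file paths (one per line) on stdin and
# prints "true" if any of them looks like a migration file, else "false".
# Assumption: migrations live in a "migrations/" directory as .py files.
MIGRATION_PATTERN='(^|/)migrations/[^/]+\.py$'

has_migration_changes() {
  # Drop package markers first, then look for any remaining migration file.
  if grep -v '__init__\.py$' | grep -Eq "$MIGRATION_PATTERN"; then
    echo "true"
  else
    echo "false"
  fi
}
```

In a workflow this could be fed from something like `git diff --name-only "$LAST_DEPLOYED" HEAD | has_migration_changes`, with the result written to the step output that drives the preboot toggle.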
@coderabbitai
Contributor

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough


Adds reusable Docker build workflow and new production and unified staging deploy workflows. Production deploy detects migrations to toggle Heroku preboot, builds via shared workflow, performs Heroku container releases, waits for completion, does best-effort Cloudflare purge and Sentry sourcemap upload, and triggers Storybook. New helper scripts support checks and polling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Reusable Docker build: /.github/workflows/docker_build.yml | New reusable workflow: computes branch_slug/metadata, generates tags & labels (including optional latest), applies branch-scoped cache strategy, builds & pushes images to GHCR, optionally builds/pushes Heroku targets, and exposes outputs like image_url, image_tags, primary_tag, heroku_targets. |
| Production Deployment: /.github/workflows/deploy_production.yml, scripts/check_migrations_since_deploy.sh, scripts/wait_for_heroku_release.sh | New production workflow: sparse checkout → migration detection (sets has_migrations) → invokes docker_build → installs Heroku CLI → toggles preboot based on migrations → releases container targets using image_url → polls release via wait_for_heroku_release.sh → best-effort Cloudflare purge & Sentry sourcemap upload → always-reset preboot → triggers Storybook. |
| Staging Deployment (Unified): /.github/workflows/deploy_staging.yml | New unified staging workflow for dev/play: uses docker_build with environment parameterization, logs into registries, releases Heroku container targets, and waits for release completion; replaces previous dev/play workflows. |
| PR preview adjustments: /.github/workflows/pr_preview.yml | PR preview updated: inline branch-slug computation, removed old image outputs, replaced inline Docker steps with reusable docker_build usage, and switched downstream references to needs.build-image.outputs.image_url. |
| Removed / Replaced workflows: /.github/workflows/deploy_dev.yml, /.github/workflows/deploy_play.yml, /.github/workflows/deploy_prod.yml | Old per-environment deploy workflows consolidated/replaced by the new deploy_staging.yml and deploy_production.yml. |

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant GHCR as GHCR (Registry)
    participant Heroku as Heroku
    participant Cloudflare as Cloudflare
    participant Sentry as Sentry

    GHA->>GHA: Checkout repo & run `check_migrations_since_deploy.sh`
    GHA->>GHCR: Invoke reusable `docker_build` → build & push images (emit `image_url`)
    alt has_migrations == true
        GHA->>Heroku: Disable preboot for app
    else
        GHA->>Heroku: Ensure preboot enabled
    end
    GHA->>Heroku: Login registry + release container targets using `image_url`
    loop poll release status
        Heroku-->>GHA: release status (pending/succeeded/failed)
    end
    alt release succeeded
        GHA->>Cloudflare: Purge zone cache (best-effort)
        GHA->>Sentry: Upload sourcemaps via Docker (best-effort)
        GHA->>GHA: Trigger Storybook workflow
        GHA->>Heroku: Reset preboot to safe state (always)
    else release failed
        GHA-->>GHA: Fail workflow
    end
sequenceDiagram
    participant GHA as GitHub Actions
    participant GHCR as GHCR (Registry)
    participant Heroku as Heroku

    GHA->>GHCR: Invoke `docker_build` (tags, caching, push)
    GHA->>Heroku: Authenticate (registry + CLI)
    GHA->>Heroku: Release staging app containers (targets from build)
    loop poll release status
        Heroku-->>GHA: release status updates
    end
    GHA->>GHA: Mark success/failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Implement Preview Deployments #4101 — Related changes to .github/workflows/pr_preview.yml switching the preview workflow to the shared docker_build.yml and updating image outputs.

Suggested reviewers

  • ncarazon
  • hlbmtc

Poem

🐇 I hopped through branches, sniffed the migration trail,
If tables rumble, I whisper preboot pale.
Else I nudge containers, let the images fly,
I purge the clouds and map sourcemaps high.
Tiny paws, big deploys — I cheer, oh my!

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | Title 'Improve Deployment Workflows' is generic and vague, failing to specify which workflows are improved or the key changes like migration detection and staging unification. | Consider a more specific title like 'Add migration detection for zero-downtime deploys and unify staging workflows' to better reflect the main improvements. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | All primary objectives from issue #4094 are met: migration detection with preboot toggling [#4094], unified staging workflow [#4094], and GitHub Actions Docker caching [#4094] are implemented. |
| Out of Scope Changes check | ✅ Passed | All changes align with stated objectives: new workflows for production and staging, migration detection scripts, helper scripts for Heroku release polling, and workflow refactoring for PR previews to use centralized Docker builds. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

…in permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/deploy_production.yml:
- Around line 168-173: The Cloudflare purge step ("Cloudflare: Purge cache")
currently fires a curl POST but ignores the response; update the run block to
capture and validate the response: store the curl output and HTTP status, print
the full response for debugging, parse the JSON and check the HTTP status and
the JSON "success" field, and exit non‑zero (fail the workflow) if the purge did
not succeed so failures are visible in the workflow logs.
- Around line 42-44: The deployment selector stored in LAST_DEPLOYED currently
picks the first deployment for prod_env regardless of outcome; update the gh api
logic so it only considers deployments whose latest status is "success" by: list
deployments for the repo (same gh api call in the LAST_DEPLOYED assignment), for
each deployment fetch its statuses
(repos/:owner/:repo/deployments/:deployment_id/statuses) and ensure the most
recent status.state == "success", then pick the first such deployment and use
its .sha; adjust the command referenced by LAST_DEPLOYED to perform this
filtering (or replace the single jq filter with a two-step query that checks
deployment statuses) so only successful prod_env deployments are considered.
🧹 Nitpick comments (4)
.github/workflows/deploy_production.yml (4)

53-59: Git fetch strategy may silently fail to retrieve required history.

The --shallow-exclude option requires the specified commit to exist in the remote history. If $LAST_DEPLOYED is old or pruned, both fetch commands may fail silently (due to || true), and the subsequent git diff will then trigger safe_default. While the fallback is safe, consider adding a warning when fetch fails to aid debugging.

Proposed improvement for better debugging
          # Fetch only the history needed to compare with last deployment
          # This is faster than fetching full history with fetch-depth: 0
-         git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || true
-         git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || true
+         git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || echo "::debug::shallow-exclude fetch failed, trying deepen"
+         git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || echo "::debug::deepen fetch also failed"

91-92: Consider pinning Heroku CLI version for reproducibility.

Piping curl to sh is the officially recommended installation method, but it always installs the latest version. For reproducible builds, consider pinning to a specific version or using a pre-built action.

Alternative: Use a community action with version pinning
      - name: Install Heroku CLI
        uses: akhileshns/heroku-deploy@v3.13.15
        with:
          heroku_api_key: ${{ secrets.HEROKU_API_KEY }}
          heroku_app_name: ""  # Just to get CLI installed
          justlogin: true

Or use npm for version control:

      - name: Install Heroku CLI
        run: npm install -g heroku@10.0.0

99-101: Consider implications of latest tag for production.

Using latest tag in production can complicate rollback scenarios since it's overwritten on each deploy. The SHA-prefixed tag provides traceability, which is good. Verify that your rollback strategy uses the SHA-based tags rather than latest.


175-177: Add failure handling for Sentry sourcemaps upload.

If the Sentry upload fails, the workflow will fail at this step. Consider whether a failed sourcemap upload should block the deployment or just warn. If it should not block, add || true or continue-on-error: true.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 915dc8a and 8f2e63b.

📒 Files selected for processing (1)
  • .github/workflows/deploy_production.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: integration-tests
🔇 Additional comments (6)
.github/workflows/deploy_production.yml (6)

17-20: Permissions appropriately scoped.

The explicit permissions block addresses the previously flagged code scanning alert. The permissions follow the principle of least privilege with only the required scopes (contents: read, packages: write, deployments: write).


115-126: Efficient cache reuse from prior build step.

The Heroku target builds correctly reuse the GHA cache populated by the all_runners build. Since the cache is already populated with mode=max, not writing cache again (--cache-to) avoids redundant cache operations.


128-136: Preboot toggle logic is sound.

The conditional correctly disables preboot when migrations are detected (ensuring migrations complete before serving new code) and enables it for zero-downtime deployments when no migrations exist.
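
The toggle described above can be sketched roughly as follows. The helper names `preboot_action` and `apply_preboot` are hypothetical; `heroku features:enable` and `heroku features:disable` are the standard CLI commands for the preboot feature. Treating anything other than an explicit `false` as "migrations present" is an assumption chosen to keep the default on the safe side.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the migration check result to a preboot action.
# Only an explicit "false" enables preboot; empty or unexpected values
# fall back to "disable" so migrations always run before new code serves.
preboot_action() {
  if [ "${1:-}" = "false" ]; then
    echo "enable"
  else
    echo "disable"
  fi
}

# Hypothetical wrapper: apply the action to a Heroku app via the CLI.
apply_preboot() {
  local app="$1" has_migrations="$2"
  heroku "features:$(preboot_action "$has_migrations")" preboot -a "$app"
}
```

A workflow step would then call something like `apply_preboot "$HEROKU_APP" "$HAS_MIGRATIONS"`, with both values passed in through the step's `env:` block.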


142-166: Release wait loop handles edge cases correctly.

The loop properly waits for pending releases and validates both succeeded status and current=true before exiting successfully. The 7.5-minute timeout (90 × 5s) is reasonable for Heroku releases.
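
For reference, a polling loop with these properties might look like the sketch below. It is not the workflow's actual code: `get_release_state` and `wait_for_release` are hypothetical names, and the default of 90 attempts at 5-second intervals simply mirrors the numbers quoted above. The state lookup is isolated in its own function so it can be swapped out or stubbed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<status> <current>" for the newest release.
# Assumes the heroku CLI and jq are installed on the runner.
get_release_state() {
  heroku releases --json -a "$1" | jq -r '.[0] | "\(.status) \(.current)"'
}

# Poll until the newest release leaves "pending". Succeeds only when the
# release both succeeded and is the current one. Interval is whole seconds.
wait_for_release() {
  local app="$1" max_attempts="${2:-90}" interval="${3:-5}"
  local status current i
  for ((i = 1; i <= max_attempts; i++)); do
    status="" current=""
    read -r status current < <(get_release_state "$app")
    case "$status" in
      pending)
        echo "Waiting for release (attempt $i/$max_attempts)..."
        sleep "$interval"
        ;;
      succeeded)
        if [ "$current" = "true" ]; then
          echo "Release succeeded"
          return 0
        fi
        echo "Release succeeded but is not current" >&2
        return 1
        ;;
      *)
        echo "Release ended with status: ${status:-unknown}" >&2
        return 1
        ;;
    esac
  done
  echo "Timed out after $((max_attempts * interval))s" >&2
  return 1
}
```

Keeping the timeout as two parameters makes it straightforward for staging and production to use different limits without duplicating the loop.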


179-185: Storybook job permissions are correctly scoped.

The job has explicit contents: read permission and uses secrets: inherit to pass required secrets to the reusable workflow. This addresses the previous code scanning alert about missing permissions.


29-29: No changes needed — actions/checkout@v6 is stable and current.

v6.0.1 was released on December 2, 2025, and is the latest stable version. Using v6 is appropriate.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@cemreinanc cemreinanc marked this pull request as draft January 17, 2026 22:02
@cemreinanc cemreinanc marked this pull request as ready for review January 19, 2026 00:19
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/deploy_production.yml:
- Around line 43-60: The GraphQL query currently inspects deployment statuses
via statuses(first: 1) which doesn't guarantee the latest status; update the
query to use the Deployment.latestStatus field instead of statuses, i.e., in the
query block that requests repository { deployments { nodes { commitOid
statuses(...) } } } replace the statuses selection with latestStatus { state }
and adjust the jq selector to check .latestStatus.state when selecting a
successful deployment commitOid so the script reliably picks the latest
successful deployment.
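
A rough sketch of the adjusted selector, assuming the GraphQL response has the shape `data.repository.deployments.nodes[]` with `commitOid` and `latestStatus { state }` on each node, ordered newest first. The function name `last_successful_sha` is hypothetical; `SUCCESS` is the GraphQL enum spelling of the state.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads a GraphQL deployments response on stdin and
# prints the commitOid of the first node whose latestStatus is SUCCESS.
# Prints nothing when no successful deployment is present. Requires jq.
last_successful_sha() {
  jq -r '
    .data.repository.deployments.nodes
    | map(select(.latestStatus.state == "SUCCESS"))
    | first
    | .commitOid // empty
  '
}
```

An empty result is a natural trigger for the safe default, since there is then no known-good commit to diff against.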
♻️ Duplicate comments (1)
.github/workflows/deploy_production.yml (1)

184-189: Cloudflare purge response still unchecked.
This is the same concern flagged earlier: the purge call doesn’t validate success, so failures can be silent.

🧹 Nitpick comments (1)
.github/workflows/deploy_production.yml (1)

131-142: Persist build cache for Heroku targets.
--cache-from without --cache-to means target‑specific layers won’t be saved, so cache benefit is limited. Add a cache‑to (optionally with a scope) to keep it warm across runs.

Proposed change
-            docker buildx build \
+            docker buildx build \
               --platform linux/amd64 \
               --target "$target" \
               --tag "registry.heroku.com/${{ env.HEROKU_APP }}/$target" \
               --push \
-              --cache-from type=gha \
+              --cache-from type=gha,scope=heroku-targets \
+              --cache-to type=gha,mode=max,scope=heroku-targets \
               .
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f2e63b and f77729b.

📒 Files selected for processing (1)
  • .github/workflows/deploy_production.yml
🔇 Additional comments (1)
.github/workflows/deploy_production.yml (1)

28-29: No action needed. actions/checkout@v6 is the current stable major version released by GitHub Actions and is properly used in the workflow.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Contributor

@elisescu elisescu left a comment

Left some inline comments

Comment on lines 69 to 70
git fetch --shallow-exclude="$LAST_DEPLOYED" origin ${{ github.ref_name }} 2>/dev/null || true
git fetch --deepen=1 origin ${{ github.ref_name }} 2>/dev/null || true
Contributor

Any reason for choosing not to fail if the fetch commands fail here? Or why the || true?

Contributor Author

@cemreinanc cemreinanc Jan 20, 2026

If they fail, we don't want to abort the whole workflow. Instead, we rely on the git diff below to catch the missing history and trigger safe_default.
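
To illustrate the fallback described in this reply, here is a minimal sketch of a check that reports "migrations present" whenever the comparison cannot be made. The function name `has_migrations_since` and the migration path pattern are assumptions; the real script's safe_default handling may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "true" or "false" for "migrations changed since
# <base>". Any uncertainty (missing commit, failed diff) prints "true",
# which keeps preboot disabled.
has_migrations_since() {
  local base="$1" files
  # After a failed shallow fetch the base commit is absent from local history.
  if ! git cat-file -e "${base}^{commit}" 2>/dev/null; then
    echo "warning: commit ${base} not in local history, assuming migrations" >&2
    echo "true"
    return 0
  fi
  if ! files=$(git diff --name-only "$base" HEAD 2>/dev/null); then
    echo "warning: git diff failed, assuming migrations" >&2
    echo "true"
    return 0
  fi
  # Assumption: migrations are .py files inside a "migrations/" directory.
  if grep -Eq '(^|/)migrations/[^/]+\.py$' <<<"$files"; then
    echo "true"
  else
    echo "false"
  fi
}
```

With this shape, swallowing fetch errors with `|| true` stays safe, because the missing commit is detected explicitly before any diff is attempted.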

…ling

- Replaced `deploy_prod.yml` and `deploy_dev.yml` with a unified `deploy_production.yml` and `deploy_staging.yml` that utilize a shared Docker build workflow for better caching and consistency.
- Introduced migration detection in the production workflow to toggle preboot settings based on database migrations.
- Updated the PR preview workflow to streamline Docker image building and deployment processes.
- Enhanced Cloudflare cache purge logic to capture and log HTTP response status and errors.
…in permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/docker_build.yml:
- Around line 138-154: The inline script uses user-controlled github.head_ref
directly which risks shell injection; set the branch value via an action env var
and then read that safe env var inside the script (e.g., add env: BRANCH_INPUT:
${{ github.head_ref || github.ref_name }} on the step with id "vars"), then in
the run block use BRANCH="$BRANCH_INPUT" (or BRANCH="$(printf '%s'
"$BRANCH_INPUT")") and continue computing BRANCH_SLUG from BRANCH; keep
IMAGE_NAME/IMAGE_NAME_LOWER handling the same but ensure all expansions are
quoted to prevent word-splitting.

In @.github/workflows/pr_preview.yml:
- Around line 127-142: The step currently injects github.head_ref directly into
the inline script (used via BRANCH_NAME) which is a script-injection risk;
change the step to pass the branch ref through a GitHub Actions environment
variable (e.g., set an env: BRANCH_REF: "${{ github.head_ref || github.ref_name
}}" on the step) and then use that env var inside the script to compute
BRANCH_NAME, BRANCH_SLUG and PREVIEW_ID; keep the same slugification logic but
ensure you reference BRANCH_REF (not github.head_ref) when creating BRANCH_NAME
and when writing outputs like fly_app, neon_branch and redis_name.
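
Both fixes follow the same pattern, sketched below under stated assumptions: the branch ref reaches the script as an environment variable (here `BRANCH_REF`, set in the step's `env:` block), so its value is treated as data and never spliced into the script text. The `slugify` helper and its exact rules are hypothetical and may not match the repository's slug logic.

```bash
#!/usr/bin/env bash
# Hypothetical helper: lowercase the input, collapse every run of characters
# outside [a-z0-9] into a single "-", and trim leading/trailing dashes.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# BRANCH_REF is expected from the step's env block, for example
#   env:
#     BRANCH_REF: ${{ github.head_ref || github.ref_name }}
BRANCH_NAME="${BRANCH_REF:-}"
BRANCH_SLUG="$(slugify "$BRANCH_NAME")"
```

Because the shell only ever sees `$BRANCH_REF` as a quoted variable, a branch name containing quotes, semicolons, or command substitutions is slugified like any other string instead of being executed.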
🧹 Nitpick comments (6)
scripts/wait_for_heroku_release.sh (1)

39-57: Handle edge cases for empty releases or null status.

If the Heroku app has no releases yet, or if the JSON parsing returns null or empty values, the script may produce unexpected behavior. The status variable could be null or empty, causing neither the pending check nor the succeeded check to match, which would incorrectly report a failure.

Proposed defensive handling
 for ((i = 1; i <= MAX_ATTEMPTS; i++)); do
   json=$(heroku releases --json -a "$APP_NAME")
   status=$(echo "$json" | jq -r '.[0].status')
   current=$(echo "$json" | jq -r '.[0].current')
 
+  if [ -z "$status" ] || [ "$status" == "null" ]; then
+    echo "Warning: Could not determine release status (attempt $i/$MAX_ATTEMPTS)..."
+    sleep "$POLL_INTERVAL"
+    continue
+  fi
+
   if [ "$status" == "pending" ]; then
     echo "Waiting for release to finish (attempt $i/$MAX_ATTEMPTS)..."
     sleep "$POLL_INTERVAL"
     continue
   fi
scripts/check_migrations_since_deploy.sh (1)

57-73: Consider increasing deployment fetch limit or documenting the limitation.

The GraphQL query fetches only the first 20 deployments. If there have been more than 20 consecutive failed or non-success deployments, no successful deployment will be found, triggering the safe default. While this is unlikely in normal operations, it's worth documenting or increasing the limit.

Proposed adjustment
-      deployments(environments: [$env], first: 20, orderBy: {field: CREATED_AT, direction: DESC}) {
+      deployments(environments: [$env], first: 50, orderBy: {field: CREATED_AT, direction: DESC}) {
.github/workflows/docker_build.yml (2)

174-186: Quote variables in shell comparisons per shellcheck SC2086.

The shell variables in the conditional comparisons should be quoted to prevent word splitting and globbing issues.

Proposed fix
       - name: Compute cache scope
         id: cache
         run: |
           BRANCH_SLUG="${{ steps.vars.outputs.branch_slug }}"
           
           # Determine if this is the main branch (writes to main cache)
-          if [ "${{ github.ref_name }}" = "main" ] || [ "${{ github.ref_name }}" = "master" ]; then
+          REF_NAME="${{ github.ref_name }}"
+          if [ "$REF_NAME" = "main" ] || [ "$REF_NAME" = "master" ]; then
             echo "cache_write_tag=buildcache-main" >> $GITHUB_OUTPUT
           else
-            echo "cache_write_tag=buildcache-${BRANCH_SLUG}" >> $GITHUB_OUTPUT
+            echo "cache_write_tag=buildcache-${BRANCH_SLUG}" >> "$GITHUB_OUTPUT"
           fi
           
-          echo "cache_scope=${BRANCH_SLUG}" >> $GITHUB_OUTPUT
+          echo "cache_scope=${BRANCH_SLUG}" >> "$GITHUB_OUTPUT"

205-222: Quote the Heroku targets variable in the for loop.

The heroku_targets input is unquoted in the for loop. While unlikely to cause issues with the default value, quoting would be safer if custom targets with special characters were ever used.

Proposed fix
       - name: Build and push Heroku targets
         if: inputs.heroku_app != ''
+        env:
+          HEROKU_TARGETS: ${{ inputs.heroku_targets }}
         run: |
           set -e
           IMAGE_NAME="${{ steps.vars.outputs.image_name }}"
           CACHE_SCOPE="${{ steps.cache.outputs.cache_scope }}"
           
-          for target in ${{ inputs.heroku_targets }}; do
+          for target in $HEROKU_TARGETS; do
             echo "Building Heroku target: $target"
.github/workflows/deploy_staging.yml (1)

57-58: Consider making the wait timeout configurable per environment.

Production uses 90 attempts while staging uses 60. If staging environments have different release characteristics, this is fine. Otherwise, consider using a consistent timeout or making it configurable.

.github/workflows/deploy_production.yml (1)

117-119: Consider splitting the Sentry upload command for readability.

The single-line Docker command is quite long. Breaking it into multiple lines would improve readability.

Proposed formatting
       - name: "Sentry: Upload sourcemaps"
         run: |
-          docker run -w /app/front_end -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} ${{ needs.build.outputs.image_url }} npx sentry-cli sourcemaps upload /app/front_end/.next/ --project metaculus-frontend --org metaculus
+          docker run \
+            -w /app/front_end \
+            -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} \
+            ${{ needs.build.outputs.image_url }} \
+            npx sentry-cli sourcemaps upload /app/front_end/.next/ \
+              --project metaculus-frontend \
+              --org metaculus
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c89da4 and 8d85bc1.

📒 Files selected for processing (6)
  • .github/workflows/deploy_production.yml
  • .github/workflows/deploy_staging.yml
  • .github/workflows/docker_build.yml
  • .github/workflows/pr_preview.yml
  • scripts/check_migrations_since_deploy.sh
  • scripts/wait_for_heroku_release.sh
🧰 Additional context used
🪛 actionlint (1.7.10)
.github/workflows/docker_build.yml

140-140: shellcheck reported issue in this script: SC2086:info:12:38: Double quote to prevent globbing and word splitting

(shellcheck)


140-140: shellcheck reported issue in this script: SC2086:info:4:42: Double quote to prevent globbing and word splitting

(shellcheck)


140-140: "github.head_ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details

(expression)


176-176: shellcheck reported issue in this script: SC2086:info:10:38: Double quote to prevent globbing and word splitting

(shellcheck)


176-176: shellcheck reported issue in this script: SC2086:info:5:45: Double quote to prevent globbing and word splitting

(shellcheck)


176-176: shellcheck reported issue in this script: SC2086:info:7:55: Double quote to prevent globbing and word splitting

(shellcheck)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build Docker Image / Build Docker Image
🔇 Additional comments (6)
scripts/check_migrations_since_deploy.sh (1)

1-109: Well-designed migration detection with safe fallbacks.

The script follows a fail-safe approach by defaulting to has_migrations=true (disabling preboot) when any uncertainty exists. The use of latestStatus.state addresses the previous review feedback about reliable status ordering. The separation into a standalone script improves testability as requested.

.github/workflows/pr_preview.yml (1)

147-154: Good consolidation to shared Docker build workflow.

The migration to the reusable docker_build.yml workflow reduces code duplication and ensures consistent build configuration across all deployment targets.

.github/workflows/deploy_staging.yml (1)

26-58: Clean unification of staging workflows.

The workflow successfully consolidates dev and play deployments into a single parameterized workflow, reducing maintenance overhead. The sparse checkout for the scripts directory is an efficient optimization.

.github/workflows/deploy_production.yml (3)

62-70: Preboot toggle logic is correct.

The migration-based preboot configuration correctly disables preboot when migrations exist (ensuring migrations run before new code) and enables it for zero-downtime deployments when no migrations are present. This aligns with the PR objectives.


85-115: Robust Cloudflare cache purge implementation.

The error handling captures HTTP status and validates the JSON response, addressing the previous review feedback. Using exit 0 with ::warning:: appropriately allows the deployment to succeed while alerting on cache purge issues.
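
A best-effort purge along these lines could be sketched as follows. The function names are hypothetical, and the `grep` check on the `success` field is a dependency-free stand-in for parsing the response with jq. The endpoint and `purge_everything` payload follow the Cloudflare v4 zone purge API.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only when the HTTP status is 200 and the
# JSON body reports "success": true.
purge_succeeded() {
  local http_status="$1" body="$2"
  [ "$http_status" = "200" ] \
    && grep -Eq '"success"[[:space:]]*:[[:space:]]*true' <<<"$body"
}

# Hypothetical wrapper: purge the zone cache, log the outcome, never fail.
purge_cloudflare_cache() {
  local zone_id="$1" token="$2" response http_status body
  # -w appends the status code on its own final line after the body.
  response=$(curl -sS -X POST \
    "https://api.cloudflare.com/client/v4/zones/${zone_id}/purge_cache" \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/json" \
    --data '{"purge_everything":true}' \
    -w '\n%{http_code}') || response=$'\n000'
  http_status="${response##*$'\n'}"  # last line: status code
  body="${response%$'\n'*}"          # everything before the last line
  if purge_succeeded "$http_status" "$body"; then
    echo "Cloudflare cache purged"
  else
    echo "::warning::Cloudflare purge failed (HTTP ${http_status}): ${body}"
  fi
  return 0  # best-effort: a purge error must not fail the deploy
}
```

Separating the validation from the request keeps the pass/fail rule easy to check on its own, while the wrapper decides that a failure only produces a warning.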


121-125: Good safety mechanism to reset preboot state.

Using always() ensures preboot is disabled even if previous steps fail, maintaining a safe default state for legacy deployment scripts as mentioned in the past review discussion.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@github-actions
Contributor

github-actions bot commented Jan 20, 2026

🚀 Preview Environment

Your preview environment is ready!

| Resource | Details |
| --- | --- |
| 🌐 Preview URL | https://metaculus-pr-4095-chore-deploy-actions-improveme.fly.dev |
| 📦 Docker Image | ghcr.io/metaculus/metaculus:chore-deploy-actions-improvements-d8f75f2 |
| 🗄️ PostgreSQL | NeonDB branch preview/pr-4095-chore-deploy-actions-improveme |
| Redis | Fly Redis mtc-redis-pr-4095-chore-deploy-actions-improveme |

Details

  • Commit: d8f75f28c8b24f15f04a0f2352555d37b22bd97a
  • Branch: chore/deploy-actions-improvements
  • Fly App: metaculus-pr-4095-chore-deploy-actions-improveme

ℹ️ Preview Environment Info

Isolation:

  • PostgreSQL and Redis are fully isolated from production
  • Each PR gets its own database branch and Redis instance
  • Changes pushed to this PR will trigger a new deployment

Limitations:

  • Background workers and cron jobs are not deployed in preview environments
  • If you need to test background jobs, use Heroku staging environments

Cleanup:

  • This preview will be automatically destroyed when the PR is closed

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/deploy_production.yml:
- Around line 100-108: The deploy job's "Sentry: Upload sourcemaps" step runs
docker run which pulls the GHCR image but lacks authentication; add a new step
that uses docker/login-action@v3 to perform docker login to ghcr.io (using
github.actor and secrets.GITHUB_TOKEN) immediately before the "Sentry: Upload
sourcemaps" step so the subsequent docker run can pull the private image
successfully.
🧹 Nitpick comments (2)
.github/workflows/deploy_production.yml (2)

1-6: Track the TODO for retiring the legacy workflow.
Consider creating a small tracking issue so this doesn’t linger indefinitely.


28-36: Confirm migration check works with a shallow checkout.
If the script relies on git diff between SHAs, the default depth of 1 can make the comparison incomplete. If that’s the case, fetch full history (or the base SHA) here.

🛠️ Suggested tweak (if history is required)
       - name: Checkout code
         uses: actions/checkout@v6
+        with:
+          fetch-depth: 0
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71ee2e4 and ca839d0.

📒 Files selected for processing (1)
  • .github/workflows/deploy_production.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Docker Image / Build Docker Image
  • GitHub Check: integration-tests
  • GitHub Check: Frontend Checks
  • GitHub Check: Backend Checks
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (8)
.github/workflows/deploy_production.yml (8)

13-16: Scoped permissions look good.


41-47: Reusable build workflow integration looks good.


62-70: Preboot toggle logic matches migration detection.


78-83: Deploy + release wait sequencing looks solid.


85-98: Cloudflare purge step is clear and safely non-fatal.


110-114: Preboot reset safeguard is good.


116-122: Storybook deployment trigger looks good.


25-27: actions/checkout@v6 is a valid released major version. v6.0.2 was released on Jan 9, 2026, and is the latest major release. Both checkouts at lines 25-27 and 72-74 are correctly pinned and will not fail due to version unavailability. No changes required.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +100 to +108
      - name: "Sentry: Upload sourcemaps"
        run: |
          docker run \
            -w /app/front_end \
            -e SENTRY_AUTH_TOKEN=${{ secrets.SENTRY_AUTH_TOKEN }} \
            ${{ needs.build.outputs.image_url }} \
            npx sentry-cli sourcemaps upload /app/front_end/.next/ \
              --project metaculus-frontend \
              --org metaculus
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd .github/workflows && wc -l deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 88


🏁 Script executed:

cat -n .github/workflows/deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 4970


🏁 Script executed:

cat -n .github/workflows/docker_build.yml

Repository: Metaculus/metaculus

Length of output: 10285


🏁 Script executed:

# Check if there's any docker login in the deploy job or elsewhere
grep -n "docker login\|docker/login-action" .github/workflows/deploy_production.yml

Repository: Metaculus/metaculus

Length of output: 45


🏁 Script executed:

# Check if the repository is public or private
git ls-files | head -20

Repository: Metaculus/metaculus

Length of output: 686


Add docker login to GHCR before docker run in the deploy job.

The deploy job runs on a fresh runner without inherited authentication from the build job. When docker run attempts to pull the GHCR image (a private registry for this repository), it will fail without prior docker login. Add a docker/login-action step before the Sentry upload step:

- name: Login to GitHub Container Registry
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
🤖 Prompt for AI Agents
In @.github/workflows/deploy_production.yml around lines 100 - 108, The deploy
job's "Sentry: Upload sourcemaps" step runs docker run which pulls the GHCR
image but lacks authentication; add a new step that uses docker/login-action@v3
to perform docker login to ghcr.io (using github.actor and secrets.GITHUB_TOKEN)
immediately before the "Sentry: Upload sourcemaps" step so the subsequent docker
run can pull the private image successfully.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve Deployment Workflows

3 participants