Merged
83 changes: 76 additions & 7 deletions .circleci/deploy.sh
@@ -87,6 +87,50 @@ wait_for_deployment() {
return 0
}

# Wait for the current deployment to fully complete (all instances replaced)
wait_for_deployment_complete() {
local app_name="$1"
local max_wait=900 # 15 minutes max for full deployment
local wait_interval=15
local waited=0

echo "Waiting for deployment of $app_name to complete..."

local app_guid=$(cf app "$app_name" --guid)
Copilot AI Dec 22, 2025

If cf app "$app_name" --guid fails, the function continues with an empty app_guid, because declaring and assigning in a single local statement discards the exit status of the command substitution. The empty-string check on line 256 covers this case in the retry logic, but wait_for_deployment_complete (line 99) has no such check, so the API call is made with an empty GUID parameter and may return unexpected results or errors. Consider failing fast if the app GUID cannot be retrieved.

Suggested change
- local app_guid=$(cf app "$app_name" --guid)
+ local app_guid
+ if ! app_guid=$(cf app "$app_name" --guid 2>/dev/null) || [ -z "$app_guid" ]; then
+ echo "✗ Failed to retrieve GUID for app '$app_name'. Aborting deployment wait."
+ return 1
+ fi

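The fail-fast pattern suggested above can be sketched as a small helper. This is only an illustration: get_app_guid is a hypothetical name that does not appear in the PR, and the sketch assumes the cf CLI is on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: resolve an app GUID and fail fast when it cannot be retrieved.
# get_app_guid is a hypothetical helper name, not part of the PR.
get_app_guid() {
  local app_name="$1"
  # Declared separately: `local guid=$(...)` would return the status of
  # `local` itself (0) and hide a failing `cf app` call.
  local guid
  if ! guid=$(cf app "$app_name" --guid 2>/dev/null) || [ -z "$guid" ]; then
    echo "Failed to retrieve GUID for app '$app_name'" >&2
    return 1
  fi
  printf '%s\n' "$guid"
}
```

A caller could then write app_guid=$(get_app_guid "$app_name") || return 1, so that no /v3/deployments request is ever built from an empty GUID.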

while [ $waited -lt $max_wait ]; do
# Get the most recent deployment status
local deployment_info=$(cf curl "/v3/deployments?app_guids=${app_guid}&order_by=-created_at&per_page=1" 2>/dev/null)
local status=$(echo "$deployment_info" | grep -o '"value":"[^"]*"' | head -1 | cut -d'"' -f4 || echo "")
local reason=$(echo "$deployment_info" | grep -o '"reason":"[^"]*"' | head -1 | cut -d'"' -f4 || echo "")
Comment on lines +104 to +105
Copilot AI Dec 22, 2025

The grep pattern '"value":"[^"]*"' is meant to extract the status value, but it matches the first field named "value" anywhere in the JSON response, which is not necessarily status.value. If the response contains another "value" field earlier in the document, the wrong one is extracted. The "reason" extraction has the same problem. Consider using a JSON parser such as jq for more reliable extraction, or a more specific grep pattern that includes the full path to the status field.

Suggested change
- local status=$(echo "$deployment_info" | grep -o '"value":"[^"]*"' | head -1 | cut -d'"' -f4 || echo "")
- local reason=$(echo "$deployment_info" | grep -o '"reason":"[^"]*"' | head -1 | cut -d'"' -f4 || echo "")
+ local status=$(echo "$deployment_info" | jq -r '.resources[0].status.value // ""' 2>/dev/null || echo "")
+ local reason=$(echo "$deployment_info" | jq -r '.resources[0].status.reason // ""' 2>/dev/null || echo "")

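The first-match problem described above is easy to reproduce. In the sketch below, the sample JSON is hypothetical: it is a compact, invented stand-in for a /v3/deployments response, with an extra "value" key placed ahead of the status object purely to trigger the failure mode. The function names are also illustrative, and the jq variant assumes jq is installed on the build image.

```shell
#!/usr/bin/env bash
# Hypothetical, compact stand-in for a /v3/deployments response. The
# annotation key named "value" is invented to show the failure mode.
sample='{"resources":[{"metadata":{"annotations":{"value":"note"}},"status":{"value":"FINALIZED","reason":"DEPLOYED"}}]}'

# First-match grep, as in the diff: returns whichever "value" comes first.
status_first_match() {
  echo "$1" | grep -o '"value":"[^"]*"' | head -1 | cut -d'"' -f4
}

# jq path lookup: reads only resources[0].status.value.
status_via_jq() {
  echo "$1" | jq -r '.resources[0].status.value // ""'
}

status_first_match "$sample"    # prints: note
if command -v jq >/dev/null 2>&1; then
  status_via_jq "$sample"       # prints: FINALIZED
fi
```

With this input the grep pipeline never sees FINALIZED, so a loop built on it would keep polling until the timeout even though the deployment had finished.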

if [ "$status" == "FINALIZED" ]; then
if [ "$reason" == "DEPLOYED" ]; then
echo "✓ Deployment completed successfully"
return 0
elif [ "$reason" == "CANCELED" ]; then
echo "✗ Deployment was canceled"
return 1
else
echo "✗ Deployment finalized with reason: $reason"
Copilot AI Dec 22, 2025

When the deployment finalizes with an unexpected reason (line 115), the error message shows the reason but gives no guidance on what to do next. Consider logging more diagnostic information, such as the full deployment response, and suggesting manual investigation steps so that operators can see what went wrong.

Suggested change
- echo "✗ Deployment finalized with reason: $reason"
+ echo "✗ Deployment finalized with unexpected reason: $reason"
+ echo "Full deployment response from Cloud Foundry:"
+ echo "$deployment_info"
+ echo "Please investigate this deployment manually, for example by running:"
+ echo " cf curl \"/v3/deployments?app_guids=${app_guid}&order_by=-created_at&per_page=1\""

return 1
fi
fi

if [ "$status" == "ACTIVE" ]; then
echo "Deployment in progress (status: $status), waiting ${wait_interval}s... (waited ${waited}s of ${max_wait}s)"
else
echo "Deployment status: $status, reason: $reason"
fi

sleep $wait_interval
waited=$((waited + wait_interval))
done

echo "Warning: Timed out waiting for deployment to complete after ${max_wait}s"
Copilot AI Dec 22, 2025

The timeout message is labeled a warning, but the function returns 1 (failure), which causes the push to be retried. A timeout after 15 minutes may indicate a stuck deployment rather than a transient failure. Consider explicitly canceling the deployment before returning failure, so that stuck deployments do not accumulate. Alternatively, document that stuck deployments remain active and may need manual cleanup.

Suggested change
- echo "Warning: Timed out waiting for deployment to complete after ${max_wait}s"
+ echo "Warning: Timed out waiting for deployment to complete after ${max_wait}s; deployment may still be active or stuck and may require manual cleanup."

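The "cancel before returning failure" option mentioned above could look roughly like the following. This is a sketch under stated assumptions, not the PR's implementation: wait_or_cancel and its arguments are hypothetical, and the cancel step assumes a cf CLI version that provides the cancel-deployment command (v7 or later).

```shell
#!/usr/bin/env bash
# Sketch: poll until a check succeeds or a deadline passes; on timeout,
# cancel the deployment instead of leaving it active.
# wait_or_cancel and its arguments are hypothetical names for illustration.
wait_or_cancel() {
  local app_name="$1" max_wait="$2" interval="$3"
  local check_cmd="$4"   # command that exits 0 once the deployment is done
  local waited=0

  while [ "$waited" -lt "$max_wait" ]; do
    if "$check_cmd" "$app_name"; then
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done

  echo "Timed out after ${max_wait}s; canceling deployment of $app_name" >&2
  # Assumes cf CLI v7 or later, which provides cancel-deployment.
  cf cancel-deployment "$app_name" \
    || echo "Cancel failed; manual cleanup may be needed" >&2
  return 1
}
```

Whether canceling is the right default depends on how the retry loop treats a rolled-back app, so this is best read as one possible design rather than a recommendation.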
return 1
}

# Run migrations as a CF task and wait for completion
run_migrations() {
local app_name="$1"
@@ -184,22 +228,47 @@ cf_push_with_retry() {
set +e
if [ -n "$manifest_path" ]; then
echo "Using manifest: $manifest_path"
- cf push "$app_name" -f "$manifest_path" --strategy rolling -t 180
+ cf push "$app_name" -f "$manifest_path" --strategy rolling -t 180 --no-wait
else
- cf push "$app_name" --strategy rolling -t 180
+ cf push "$app_name" --strategy rolling -t 180 --no-wait
fi
exit_code=$?
set -e

if [ $exit_code -eq 0 ]; then
- echo "Successfully pushed $app_name"
- release_deploy_lock "$app_name"
- trap - EXIT # Clear the trap
- return 0
+ echo "Push initiated successfully, waiting for full deployment to complete..."
+ if wait_for_deployment_complete "$app_name"; then
+ echo "Successfully deployed $app_name"
+ release_deploy_lock "$app_name"
+ trap - EXIT # Clear the trap
+ return 0
+ else
+ echo "Deployment did not complete successfully"
+ # Continue to retry logic below
+ fi
Copilot AI Dec 22, 2025

The else block at line 245 handles the case where wait_for_deployment_complete returns failure, but there is no matching else for a non-zero exit_code at line 238. If cf push itself fails (exit_code != 0), execution falls through to the retry check at line 251 without logging the push failure. The message at line 252 does mention "Push failed", but an explicit else block that logs the failure and its exit code before the retry logic would make the two failure modes easier to tell apart.

Suggested change
- fi
+ fi
+ else
+ echo "cf push command failed for $app_name with exit code: $exit_code"
+ # Continue to retry logic below

fi

if [ $i -lt $max_retries ]; then
- echo "Push failed (exit code: $exit_code), waiting ${retry_delay}s before retry..."
+ echo "Push failed or deployment incomplete (exit code: $exit_code), checking for active deployments..."

# Check if there's an active deployment that we should wait for instead of retrying
local app_guid=$(cf app "$app_name" --guid 2>/dev/null || echo "")
if [ -n "$app_guid" ]; then
local active_deployment=$(cf curl "/v3/deployments?app_guids=${app_guid}&status_values=ACTIVE" 2>/dev/null | grep -c '"ACTIVE"' || echo "0")
Copilot AI Dec 22, 2025

grep -c '"ACTIVE"' counts the lines that contain the literal string "ACTIVE" anywhere in the JSON response, so it can match fields other than the deployment status (for example, descriptions or other unrelated fields). This can produce a false positive, where the script believes a deployment is active when none is. Consider a more precise check, such as reading the status field with a JSON parser, or a more specific grep pattern that includes the field context.

Suggested change
- local active_deployment=$(cf curl "/v3/deployments?app_guids=${app_guid}&status_values=ACTIVE" 2>/dev/null | grep -c '"ACTIVE"' || echo "0")
+ local active_deployment=$(cf curl "/v3/deployments?app_guids=${app_guid}&status_values=ACTIVE" 2>/dev/null | jq -r '.pagination.total_results // 0' 2>/dev/null || echo "0")

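A related detail in the original grep line may be worth checking. When nothing matches, grep -c prints 0 and also exits with status 1, so the || echo "0" fallback fires as well and the captured value becomes two lines. The sketch below reproduces this with a hypothetical empty response; the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical empty result for an ACTIVE-filtered /v3/deployments query.
empty='{"pagination":{"total_results":0},"resources":[]}'

# As written in the diff: grep -c prints 0 AND exits 1, so the
# fallback echo runs too and a second 0 is appended.
count_with_fallback() {
  echo "$1" | grep -c '"ACTIVE"' || echo "0"
}

# Variant: keep grep's own count and only neutralise its exit status.
count_without_fallback() {
  echo "$1" | grep -c '"ACTIVE"' || true
}

count_with_fallback "$empty"       # prints 0 twice, on separate lines
count_without_fallback "$empty"    # prints 0 once
```

A two-line value passed to [ "$active_deployment" -gt 0 ] is not a valid integer, so the test reports an error instead of evaluating cleanly. The jq-based suggestion avoids this as long as jq is available, since jq prints a single number and exits 0.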

if [ "$active_deployment" -gt 0 ]; then
echo "Active deployment detected, waiting for it to complete instead of retrying..."
if wait_for_deployment_complete "$app_name"; then
echo "Existing deployment completed successfully"
release_deploy_lock "$app_name"
trap - EXIT
return 0
fi
echo "Existing deployment did not complete successfully, will retry..."
fi
fi

echo "Waiting ${retry_delay}s before retry..."
sleep $retry_delay
# Re-check for in-progress deployments before retrying
wait_for_deployment "$app_name"