Skip to content

Conversation

@okkez
Copy link
Contributor

@okkez okkez commented Jul 18, 2025

Tasks executed with FARGATE_SPOT may be interrupted. When interrupted, the following log is output.

Task wrapbox_test is failed. task=arn:aws:ecs:xxx:xxx:task/xxx/xxx, cmd_index=60, exit_code=2, task_stopped_reason=Your Spot Task was interrupted., container_stopped_reason= (Wrapbox::Runner::Ecs::ExecutionFailure)

Handling this log, the Spot Task is retried in the wrapbox. This

@okkez okkez requested a review from Copilot July 18, 2025 06:27
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for automatically retrying FARGATE_SPOT tasks that are interrupted by AWS. The implementation detects when a task is interrupted by matching the "Your Spot Task was interrupted" message in the task's stopped reason and triggers a retry mechanism.

  • Adds detection for FARGATE_SPOT task interruptions using regex pattern matching
  • Implements retry logic for interrupted spot tasks similar to existing host termination handling
  • Updates CI configuration to test against newer Ruby versions (3.3, 3.4) while removing older versions (2.7, 3.0, 3.1)

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
lib/wrapbox/runner/ecs.rb Adds regex constant and retry logic for detecting and handling interrupted FARGATE_SPOT tasks
.github/workflows/main.yml Updates Ruby version matrix for CI testing to include newer versions
Comments suppressed due to low confidence (1)

lib/wrapbox/runner/ecs.rb:31

  • [nitpick] The constant name SPOT_TASK_INTERRUPTED should include 'REGEXP' suffix to match the naming pattern of HOST_TERMINATED_REASON_REGEXP for consistency.
      SPOT_TASK_INTERRUPTED = /Your Spot Task was interrupted/

WAIT_DELAY = 5
TERM_TIMEOUT = 120
HOST_TERMINATED_REASON_REGEXP = /Host EC2.*terminated/
SPOT_TASK_INTERRUPTED = /Your Spot Task was interrupted/
Copy link

Copilot AI Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern should end with a word boundary or be more specific to avoid potential false matches if AWS changes the message format slightly. Consider using /Your Spot Task was interrupted\b/ or anchoring the pattern.

Suggested change
SPOT_TASK_INTERRUPTED = /Your Spot Task was interrupted/
SPOT_TASK_INTERRUPTED = /Your Spot Task was interrupted\b/

Copilot uses AI. Check for mistakes.
@okkez okkez requested review from akihiro17 and joker1007 July 18, 2025 06:39
@okkez okkez self-assigned this Jul 18, 2025
@okkez
Copy link
Contributor Author

okkez commented Jul 18, 2025

@joker1007 @akihiro17 Can you take a look?

Copy link
Contributor

@joker1007 joker1007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. LGTM!

@okkez okkez merged commit baa849b into master Jul 21, 2025
6 checks passed
@okkez okkez deleted the support-retrying-interrupted-spot-task branch July 21, 2025 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants