@ryanporterfield

This PR replaces #5. It adds the code needed to retry tasks on the SB platform, along with a monitor_task method in the Platform parent class in base_platform.py. The monitor_task method is platform-agnostic: it waits until a task finishes and retries it on failure. If a dictionary of instance types is provided, each failure moves the retry to the next instance in the order given. If no dictionary is provided, the task is retried on the same instance up to sb_max_retry_count times (default 3).
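
For readers without the diff open, the retry behaviour described above could look roughly like the sketch below. It is not the code in this PR. The helper names get_task_state and rerun_task, the state strings "Complete" and "Failed", the poll_interval parameter, and the choice to raise once the retry plan is exhausted are all assumptions made for illustration. The sketch also assumes the dictionary's values are instance type names, that the initial run has already been submitted, and that each failure consumes the next value in insertion order.

```python
import time
from abc import ABC, abstractmethod


class Platform(ABC):
    """Illustrative parent class; only monitor_task is named in the PR."""

    @abstractmethod
    def get_task_state(self, task):
        """Hypothetical hook: return 'Complete', 'Failed', or a running state."""

    @abstractmethod
    def rerun_task(self, task, instance_type=None):
        """Hypothetical hook: resubmit the task and return the new task.

        instance_type=None means "reuse the same instance".
        """

    def monitor_task(self, task, instance_types=None, max_retry_count=3,
                     poll_interval=30):
        """Wait for a task to finish, retrying it on failure."""
        if instance_types:
            # Assumption: values are instance names, tried in insertion order.
            retry_plan = list(instance_types.values())
        else:
            # No dictionary: retry on the same instance, max_retry_count times.
            retry_plan = [None] * max_retry_count

        failures = 0
        while True:
            state = self.get_task_state(task)
            if state == "Complete":
                return task
            if state == "Failed":
                if failures >= len(retry_plan):
                    # Assumption: give up with an error once retries run out.
                    raise RuntimeError(f"task failed after {failures} retries")
                task = self.rerun_task(task, retry_plan[failures])
                failures += 1
                continue
            # Task is still running; wait before polling again.
            time.sleep(poll_interval)
```

Keeping the polling loop in the parent class means each platform subclass only needs to supply the two hooks, which matches the platform-agnostic intent described above.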

Note: I left the platform-specific arguments out of the monitor_task method in the Platform class. I'm not sure whether you would want to mirror this in arvados_platform.py, but I wanted to flag it here.

@golharam golharam self-assigned this Oct 21, 2024

3 participants