Summary:
Currently, configuring a Coiled cluster requires manually specifying n_workers, worker_vm_types, nthreads, and memory_limit. This can lead to misconfigurations (e.g., requesting more threads than the instance has vCPUs, or more memory per worker than the instance provides) and inefficient scaling. It would be useful to have built-in validation against AWS instance specifications.
Problem:
- Users must manually calculate sensible values for `nthreads` and `memory_limit` based on the worker VM type.
- Coiled can scale, but there’s no safety check to prevent invalid configurations.
- Different AWS instance types have different vCPU, memory, and network/EBS bandwidth limits.
- Users may overcommit resources or choose inefficient settings, leading to wasted cost or runtime errors.
Proposed solution:
- Provide a JSON/YAML database of AWS instance types with relevant specs (`vCPU`, `memory`).
- Implement optional validation logic in the Coiled client (or helper function) that:
  - Checks that `nthreads <= vCPU`
  - Checks that `memory_limit <= memory`
  - Warns if requested resources exceed instance capabilities
- Optionally suggest recommended defaults for `nthreads` and `memory_limit` based on instance type and number of workers.
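To make the proposal concrete, here is a rough sketch of what such a helper could look like. Nothing in it is existing Coiled API: `InstanceSpec`, `INSTANCE_SPECS`, `validate_worker_config`, and `suggest_defaults` are hypothetical names, the spec table is a tiny hand-written stand-in for the proposed JSON/YAML database, and the 90% memory headroom is just an assumed default, not a recommendation from Coiled or AWS.

```python
import warnings
from dataclasses import dataclass

GIB = 2**30  # bytes per GiB


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical record for one entry of the instance-type database."""
    vcpu: int
    memory_gib: int


# Stand-in for the proposed JSON/YAML database (illustrative subset only).
INSTANCE_SPECS = {
    "m5.large": InstanceSpec(vcpu=2, memory_gib=8),
    "m5.xlarge": InstanceSpec(vcpu=4, memory_gib=16),
    "r5.2xlarge": InstanceSpec(vcpu=8, memory_gib=64),
}


def validate_worker_config(vm_type, nthreads, memory_limit):
    """Return a list of problems (empty if the config fits the instance).

    memory_limit is in bytes. Each problem is also emitted as a warning,
    so validation never blocks cluster creation.
    """
    spec = INSTANCE_SPECS.get(vm_type)
    if spec is None:
        problems = [f"unknown instance type {vm_type!r}; cannot validate"]
    else:
        problems = []
        if nthreads > spec.vcpu:
            problems.append(
                f"nthreads={nthreads} exceeds {spec.vcpu} vCPUs on {vm_type}"
            )
        if memory_limit > spec.memory_gib * GIB:
            problems.append(
                f"memory_limit={memory_limit / GIB:.1f} GiB exceeds "
                f"{spec.memory_gib} GiB on {vm_type}"
            )
    for message in problems:
        warnings.warn(message, UserWarning, stacklevel=2)
    return problems


def suggest_defaults(vm_type, memory_headroom=0.9):
    """Suggest per-worker settings: one thread per vCPU, and a fraction of
    instance memory (assumed headroom for the OS and other processes)."""
    spec = INSTANCE_SPECS[vm_type]
    return {
        "nthreads": spec.vcpu,
        "memory_limit": int(spec.memory_gib * GIB * memory_headroom),
    }


if __name__ == "__main__":
    # Overcommitted on both threads and memory: prints two problems.
    print(validate_worker_config("m5.large", nthreads=4, memory_limit=16 * GIB))
    # Suggested defaults always pass validation for a known instance type.
    print(validate_worker_config("m5.xlarge", **suggest_defaults("m5.xlarge")))
```

Returning the problems as a list while also emitting warnings keeps the check optional: callers can ignore it, log it, or turn it into a hard error. This sketch only covers per-worker settings; taking the number of workers into account (for example, to report total cluster vCPUs and memory) would be a natural extension.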
Benefits:
- Prevents misconfigured clusters.
- Helps users select optimal Coiled settings without guessing.
- Makes it easier to scale efficiently on AWS.