Add feature to validate Coiled cluster configuration against AWS instance specifications #239

@lbesnard

Summary:
Currently, configuring a Coiled cluster requires manually specifying n_workers, worker_vm_types, nthreads, and memory_limit. This can lead to misconfigurations (e.g., specifying more threads than the instance has vCPUs, or a per-worker memory limit larger than the instance's memory) and to inefficient scaling. It would be useful to have built-in validation against AWS instance specifications.

Problem:

  • Users must manually calculate sensible values for nthreads and memory_limit based on the worker VM type.
  • Coiled can scale clusters, but there is no safety check to prevent invalid configurations.
  • Different AWS instance types have different vCPU, memory, and network/EBS bandwidth limits.
  • Users may overcommit resources or choose inefficient settings, leading to wasted cost or runtime errors.

Proposed solution:

  • Provide a JSON/YAML database of AWS instance types with relevant specs (vCPU, memory).

  • Implement optional validation logic in the Coiled client (or helper function) that:

    • Checks that nthreads <= vCPU
    • Checks that memory_limit <= memory
    • Warns if requested resources exceed instance capabilities
  • Optionally suggest recommended defaults for nthreads and memory_limit based on instance type and number of workers.
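
A minimal sketch of what the checks above could look like, to make the proposal concrete. Everything here is hypothetical: `INSTANCE_SPECS`, `validate_worker_config`, and `suggest_defaults` are illustrative names, not part of the Coiled API. The sketch assumes one Dask worker per VM, memory expressed in GiB, and a `memory_fraction` of 0.9 as an arbitrary headroom for the OS. The small in-line table stands in for the proposed JSON/YAML database.

```python
import warnings

# Hypothetical stand-in for the proposed JSON/YAML database of instance specs.
INSTANCE_SPECS = {
    "m5.large": {"vcpu": 2, "memory_gib": 8},
    "m5.xlarge": {"vcpu": 4, "memory_gib": 16},
    "r5.xlarge": {"vcpu": 4, "memory_gib": 32},
}


def validate_worker_config(vm_type, nthreads, memory_limit_gib, specs=INSTANCE_SPECS):
    """Return a list of problems found; an empty list means the checks passed.

    Assumes one worker per VM and memory_limit given in GiB (both assumptions).
    """
    if vm_type not in specs:
        return [f"unknown instance type {vm_type!r}; cannot validate"]

    spec = specs[vm_type]
    problems = []
    # Check nthreads <= vCPU
    if nthreads > spec["vcpu"]:
        problems.append(
            f"nthreads={nthreads} exceeds {spec['vcpu']} vCPUs on {vm_type}"
        )
    # Check memory_limit <= memory
    if memory_limit_gib > spec["memory_gib"]:
        problems.append(
            f"memory_limit={memory_limit_gib} GiB exceeds "
            f"{spec['memory_gib']} GiB on {vm_type}"
        )
    return problems


def warn_on_invalid_config(vm_type, nthreads, memory_limit_gib, specs=INSTANCE_SPECS):
    """Emit a warning per problem instead of failing, keeping validation optional."""
    problems = validate_worker_config(vm_type, nthreads, memory_limit_gib, specs)
    for problem in problems:
        warnings.warn(problem, stacklevel=2)
    return not problems


def suggest_defaults(vm_type, specs=INSTANCE_SPECS, memory_fraction=0.9):
    """Suggest nthreads and memory_limit for one worker per VM.

    memory_fraction is an assumed headroom factor, not a Coiled or Dask default.
    """
    spec = specs[vm_type]
    return {
        "nthreads": spec["vcpu"],
        "memory_limit_gib": spec["memory_gib"] * memory_fraction,
    }
```

For example, `validate_worker_config("m5.large", nthreads=4, memory_limit_gib=6)` would report the thread overcommit only, and `suggest_defaults("m5.xlarge")` would propose 4 threads. Returning a list of problems rather than raising leaves it to the caller to decide whether a violation is a warning or a hard error.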

Benefits:

  • Prevents misconfigured clusters.
  • Helps users select optimal Coiled settings without guessing.
  • Makes it easier to scale efficiently on AWS.
