feat: add Oracle OKE recipe overlays #429

@mchmarny

Summary

As part of the ongoing effort to expand AICR recipe coverage across CSP and NCP managed Kubernetes offerings, we should add support for Oracle Kubernetes Engine (OKE). AICR already supports AKS (Azure), EKS (AWS), and GKE (Google Cloud) — OKE is the next logical addition to round out coverage of major cloud providers.

Scope

  • Add base oke.yaml overlay with OKE-specific service criteria and defaults
  • Add intent overlays: oke-training.yaml, oke-inference.yaml
  • Add accelerator-specific overlays for supported GPU types (e.g., h100-oke-training.yaml, h100-oke-inference.yaml)
  • Add OS/framework variant overlays as needed (e.g., h100-oke-ubuntu-training.yaml)
  • Add OKE-specific component values if GPU Operator or other components require OKE-specific configuration
  • Add KWOK node simulation scripts for OKE node topologies
  • Validate all new overlays via KWOK tests
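
The file names in the list above appear to follow a pattern of optional accelerator, service, optional OS variant, and intent. As a rough sketch of that convention (the helper `overlay_names` is hypothetical and not part of AICR, and the exact set of overlays is still to be decided "as needed"), the candidate names could be enumerated like this:

```python
from itertools import product


def overlay_names(service, intents, accelerators, os_variants=()):
    """Enumerate candidate overlay file names.

    Hypothetical helper: the pattern is inferred from the example names
    in this issue, not taken from AICR code.
    """
    # Base overlay, e.g. oke.yaml
    names = [f"{service}.yaml"]
    # Intent overlays, e.g. oke-training.yaml
    names += [f"{service}-{intent}.yaml" for intent in intents]
    # Accelerator-specific overlays, e.g. h100-oke-training.yaml
    names += [
        f"{acc}-{service}-{intent}.yaml"
        for acc, intent in product(accelerators, intents)
    ]
    # OS variant overlays, e.g. h100-oke-ubuntu-training.yaml
    names += [
        f"{acc}-{service}-{os_name}-{intent}.yaml"
        for acc, os_name, intent in product(accelerators, os_variants, intents)
    ]
    return names


if __name__ == "__main__":
    for name in overlay_names("oke", ["training", "inference"], ["h100"], ["ubuntu"]):
        print(name)
```

A list like this could serve as a checklist when deciding which combinations actually need an overlay; not every generated name has to exist.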

Context

This is part of a broader effort to ensure AICR provides validated GPU-accelerated configurations across all major managed Kubernetes platforms:

Provider     | Service | Status
AWS          | EKS     | Supported
Azure        | AKS     | Supported
Google Cloud | GKE     | Supported
Oracle Cloud | OKE     | This issue

Notes

  • Follow existing overlay patterns established by EKS/AKS/GKE
  • Reference ADR-003 (docs: add ADR-003 for scaling KWOK recipe tests #424) for CI scaling considerations when adding new overlays
  • OKE-specific GPU node labels, taints, and topology should be captured in KWOK simulation scripts
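
To make the last note concrete, here is a minimal sketch of what a simulated OKE GPU node might look like. It is an illustration only: the function `fake_gpu_node` is hypothetical, the shape name `BM.GPU.H100.8` and the `nvidia.com/gpu` taint are assumptions that should be verified against a real OKE GPU node pool, and the existing EKS/AKS/GKE simulation scripts may structure nodes differently. The `kwok.x-k8s.io/node: fake` annotation is the one KWOK's documentation uses to mark nodes it should manage.

```python
import json


def fake_gpu_node(name, instance_type, gpu_count):
    """Build a Node manifest for KWOK to simulate (hypothetical helper)."""
    gpus = str(gpu_count)  # Kubernetes resource quantities are strings
    return {
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            "name": name,
            # Annotation used in the KWOK docs to select simulated nodes.
            "annotations": {"kwok.x-k8s.io/node": "fake"},
            "labels": {
                "kubernetes.io/hostname": name,
                # Assumption: the OCI shape is exposed via the standard label.
                "node.kubernetes.io/instance-type": instance_type,
            },
        },
        "spec": {
            # Assumption: GPU nodes carry this taint; confirm on a real cluster.
            "taints": [
                {"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}
            ],
        },
        "status": {
            "capacity": {"nvidia.com/gpu": gpus},
            "allocatable": {"nvidia.com/gpu": gpus},
        },
    }


if __name__ == "__main__":
    # JSON manifests are accepted by `kubectl apply -f -`.
    print(json.dumps(fake_gpu_node("oke-gpu-0", "BM.GPU.H100.8", 8), indent=2))
```

Generating the manifest from parameters keeps the labels and taints in one place, so they are easy to correct once the real OKE values are confirmed.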

Labels: enhancement
