Cloud-agnostic team-operator: remove cloud provider dependencies #109

@ian-flores

Goal

Make team-operator fully cloud-agnostic by removing all direct cloud provider dependencies (AWS, Azure) and relying solely on Kubernetes-native abstractions. Cloud-specific translation is pushed to the infrastructure layer (PTD), not the operator.

Motivation

  • Running on a new cloud requires operator code changes
  • Running locally on kind requires workarounds
  • The operator cannot be tested in isolation from cloud infrastructure
  • Cloud provider SDKs are compiled into the operator binary (attack surface, binary size)

Target Architecture

The operator interacts only with standard Kubernetes APIs:

  • Storage: PVCs + StorageClasses (not direct FSx/NetApp PV creation)
  • IAM: Plain ServiceAccounts with optional passthrough annotations (not hardcoded IRSA ARN computation)
  • Secrets: K8s Secrets only (not AWS Secrets Manager SDK calls or SecretProviderClass)
  • Ingress: Gateway API HTTPRoute (not Ingress + Traefik Middleware CRDs)
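As a rough illustration of what "standard Kubernetes APIs only" can look like on the operator side, the sketch below builds a PVC, a ServiceAccount, and an HTTPRoute as plain manifests. The helper names and the example values (`nfs-client`, `posit-team`, port `8787`) are assumptions for illustration and are not taken from the team-operator code.

```python
from typing import Optional


def build_pvc(name: str, namespace: str, storage_class: str, size: str) -> dict:
    """PVC that only names a StorageClass; the cluster's provisioner does the rest."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def build_service_account(name: str, namespace: str,
                          annotations: Optional[dict] = None) -> dict:
    """Plain ServiceAccount; annotations are copied through, never computed."""
    metadata = {"name": name, "namespace": namespace}
    if annotations:
        metadata["annotations"] = dict(annotations)
    return {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": metadata}


def build_http_route(name: str, namespace: str, gateway: str,
                     hostname: str, service: str, port: int) -> dict:
    """Gateway API HTTPRoute sending all paths on one hostname to one Service."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "parentRefs": [{"name": gateway}],
            "hostnames": [hostname],
            "rules": [{
                "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
                "backendRefs": [{"name": service, "port": port}],
            }],
        },
    }
```

None of these manifests mention a cloud provider: the StorageClass name, any ServiceAccount annotations, and the Gateway name are inputs supplied by the infrastructure layer.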

Cloud translation happens via infrastructure components:

  • nfs-subdir-external-provisioner (storage)
  • EKS Pod Identity / Azure Workload Identity (IAM)
  • external-secrets-operator (secrets)
  • Traefik Gateway API provider (ingress)
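On the infrastructure side, the secrets translation could be expressed as an `ExternalSecret` that external-secrets-operator reconciles into an ordinary K8s Secret. This is a minimal sketch: the store name, the remote key, and the `v1beta1` API version are assumptions (the served version depends on the installed external-secrets release), and the actual PTD wiring may differ.

```python
def build_external_secret(name: str, namespace: str, store: str,
                          remote_key: str, refresh: str = "1h") -> dict:
    """ExternalSecret that copies every key of one remote secret into a K8s Secret."""
    return {
        # Assumed API version; newer external-secrets releases also serve "v1".
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": refresh,
            # The store holds the cloud-specific provider config (AWS, Azure, ...).
            "secretStoreRef": {"name": store, "kind": "ClusterSecretStore"},
            # Name of the K8s Secret that the operator reads.
            "target": {"name": name},
            "dataFrom": [{"extract": {"key": remote_key}}],
        },
    }
```

With this arrangement the operator only needs the name of the resulting Secret; which cloud secret store backs it is decided entirely by the `ClusterSecretStore`.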

Design Document

Full design doc with all architectural decisions: thoughts/shared/plans/2026-02-26-cloud-agnostic-team-operator.md in the ptd-workspace repo.

Implementation Status (Phase 1 — Backward Compatible)

All Phase 1 work is complete and available in draft PRs. Every change is additive and backward-compatible: existing CRs continue to work unchanged.
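One way additive, optional fields can leave existing CRs untouched is to fall back to the legacy behaviour whenever the new field is absent. The sketch below uses the `serviceAccountName`, `serviceAccountAnnotations`, and `podLabels` field names mentioned in this issue; the function name, the legacy-name argument, and the fallback logic itself are assumptions, and the real implementation in the PRs may differ.

```python
from typing import Optional

# Real IRSA annotation key; the ARN value passed in below is a made-up example.
IRSA_ANNOTATION = "eks.amazonaws.com/role-arn"


def resolve_identity(product_spec: dict, legacy_sa_name: str,
                     legacy_role_arn: Optional[str] = None) -> dict:
    """Pick the new explicit path when opted in, else keep the legacy result."""
    explicit_name = product_spec.get("serviceAccountName")
    if explicit_name:
        # New path: everything is passed through from the CR, nothing is computed.
        return {
            "serviceAccountName": explicit_name,
            "annotations": dict(product_spec.get("serviceAccountAnnotations", {})),
            "podLabels": dict(product_spec.get("podLabels", {})),
        }
    # Legacy path: behaviour for CRs that do not set the new field.
    annotations = {IRSA_ANNOTATION: legacy_role_arn} if legacy_role_arn else {}
    return {
        "serviceAccountName": legacy_sa_name,
        "annotations": annotations,
        "podLabels": {},
    }
```

A CR that sets none of the new fields resolves exactly as before, which is the property the backward-compatibility claim depends on.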

team-operator PRs

| PR | Track | Branch | Status |
| --- | --- | --- | --- |
| #101 | Storage: PVC + StorageClass | cloud-agnostic-storage | CI green, reviewed |
| #102 | IAM: ServiceAccountName, Annotations, PodLabels | cloud-agnostic-iam | CI green, reviewed |
| #103 | Secrets: K8s Secret path | cloud-agnostic-secrets | CI green, reviewed |
| #104 | Gateway API: HTTPRoute dual-path | cloud-agnostic-gateway | CI green, reviewed |
| #105 | Migration runbook | migration-runbook | Reviewed |
| #106 | Kind cloud-agnostic dev setup | kind-cloud-agnostic | Reviewed |

ptd PRs (posit-dev/ptd)

| PR | Track | Branch | Status |
| --- | --- | --- | --- |
| #146 | AWS + Azure infra + Site CR wiring | cloud-agnostic-infra | CI green |
| #153 | Gateway API PTD wiring | cloud-agnostic-gateway-ptd | |
| #154 | Infrastructure tests (28 tests) | cloud-agnostic-tests | |

Decisions Made

  • Storage: nfs-subdir-external-provisioner with annotation-based pathPattern (no data migration needed)
  • IAM (AWS): EKS Pod Identity (no SA annotations needed)
  • IAM (Azure): Passthrough serviceAccountAnnotations + podLabels for Workload Identity
  • Secrets: external-secrets-operator syncs cloud secrets → K8s Secrets
  • Ingress: Gateway API with Traefik provider (Traefik stays as controller)
  • SA name contract: Explicit serviceAccountName field per product in CR spec
  • CRD deprecation: Keep old fields until all clusters migrated
  • Traefik upgrade: Not needed — workload clusters already on v3
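To make the storage decision concrete: nfs-subdir-external-provisioner accepts a `pathPattern` StorageClass parameter that can reference PVC annotations, so a PVC can be pointed at a directory that already exists on the NFS export. The sketch below shows such a StorageClass and a small function that imitates the template expansion for illustration only. It is not the provisioner's own code, and the StorageClass name and example paths are assumptions.

```python
import re

STORAGE_CLASS = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "nfs-client"},  # hypothetical name
    "provisioner": "k8s-sigs.io/nfs-subdir-external-provisioner",
    "parameters": {
        # Directory is taken from an annotation on the PVC.
        "pathPattern": "${.PVC.annotations.nfs.io/storage-path}",
        "onDelete": "retain",
    },
}


def expand_path_pattern(pattern: str, pvc: dict) -> str:
    """Imitate pathPattern expansion for name, namespace, and annotations."""
    meta = pvc["metadata"]

    def lookup(match: "re.Match") -> str:
        field = match.group(1)
        if field == "name":
            return meta["name"]
        if field == "namespace":
            return meta.get("namespace", "")
        if field.startswith("annotations."):
            key = field[len("annotations."):]
            # A missing annotation expands to an empty string in this sketch.
            return meta.get("annotations", {}).get(key, "")
        raise ValueError(f"unsupported field in this sketch: {field}")

    return re.sub(r"\$\{\.PVC\.([^}]+)\}", lookup, pattern)


pvc = {
    "metadata": {
        "name": "workbench-home",
        "namespace": "posit-team",
        "annotations": {"nfs.io/storage-path": "site-a/workbench"},
    }
}
print(expand_path_pattern(STORAGE_CLASS["parameters"]["pathPattern"], pvc))
```

Because the annotation value is chosen by whoever creates the PVC, it can be set to the directory the existing volume already uses, which is presumably why no data migration is needed.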

Remaining Work

  • Merge Phase 1 PRs (operator + PTD)
  • Enable feature flags on staging cluster and test end-to-end
  • Roll out to production clusters incrementally
  • Phase 3: Remove deprecated CRD fields after all clusters migrated

Branches (preserved)

All implementation branches are preserved in both repos. To resume work, check out any branch and continue from where it left off.
