Closed
31 commits
669d1e3
Add cloud-agnostic infrastructure components for team-operator
ian-flores Feb 26, 2026
d3618dc
Address review findings (job 220)
ian-flores Feb 26, 2026
6a1b9cd
Address review findings (job 230)
ian-flores Feb 26, 2026
f6becb6
Address review findings (job 233)
ian-flores Feb 26, 2026
e20e5fc
Address review findings (job 244)
ian-flores Feb 26, 2026
1776e6f
Address review findings (job 248)
ian-flores Feb 26, 2026
78b6150
Address review findings (job 253)
ian-flores Feb 26, 2026
4ce5e48
Address review findings (job 260)
ian-flores Feb 26, 2026
0d0ca46
fix: satisfy ruff TRY003/EM102 lint rules for exception messages
ian-flores Feb 26, 2026
676a251
Address review findings (job 271)
ian-flores Feb 26, 2026
f21bcf6
Address review findings (job 281)
ian-flores Feb 26, 2026
01d83cd
Address review findings (job 290)
ian-flores Feb 26, 2026
60552f0
Address review findings (job 299)
ian-flores Feb 26, 2026
88c53ae
Address review findings (job 310)
ian-flores Feb 26, 2026
c18428f
Address review findings (job 318)
ian-flores Feb 26, 2026
1db7956
Address review findings (job 326)
ian-flores Feb 26, 2026
16b3ad6
Address review findings (job 330)
ian-flores Feb 26, 2026
ecbdeaa
Address review findings (job 335)
ian-flores Feb 26, 2026
b266267
Address review findings (job 349)
ian-flores Feb 26, 2026
39565d9
Address review findings (job 357)
ian-flores Feb 26, 2026
050c4c9
Address review findings (job 365)
ian-flores Feb 26, 2026
66d8e71
Address review findings (job 374)
ian-flores Feb 26, 2026
1b1fc84
Address review findings (job 386)
ian-flores Feb 26, 2026
4230b7e
Address review findings (job 394)
ian-flores Feb 26, 2026
f6828cf
Address review findings (job 399)
ian-flores Feb 26, 2026
1fa445a
Address review findings (job 404)
ian-flores Feb 26, 2026
61bf939
Address review findings (job 411)
ian-flores Feb 26, 2026
df0f651
Address review findings (job 414)
ian-flores Feb 26, 2026
85fe50b
Address review findings (job 420)
ian-flores Feb 26, 2026
b90b3c6
feat: wire cloud-agnostic operator fields into Site CR construction
ian-flores Feb 27, 2026
d0db259
test: add tests for cloud-agnostic infrastructure code
ian-flores Feb 27, 2026
19 changes: 19 additions & 0 deletions docs/KNOWN_ISSUES.md
@@ -117,4 +117,23 @@ exit
- Changes made directly via Pulumi CLI may be overwritten by subsequent `ptd ensure` runs if they conflict with your configuration
- This is an advanced troubleshooting tool - use it when the standard PTD commands aren't sufficient


### External Secrets Operator: ClusterSecretStore Fails on First Run

**The Problem:**
When `enable_external_secrets_operator` is enabled on a fresh cluster, the `ClusterSecretStore` resource
may fail to apply with `no matches for kind "ClusterSecretStore"`. Pulumi registers the ESO HelmChart CR,
but the CRDs that the chart installs may not yet be available when Pulumi attempts to create the
`ClusterSecretStore`.

**Why It Happens:**
A `depends_on` on the HelmChart CR only ensures that the API server has accepted the CR, not that the
ESO CRDs have finished installing. On a fresh cluster, CRD propagation can take several minutes.
Pulumi retries for up to 10 minutes via `CustomTimeouts(create="10m")`, but may still time out on
very slow clusters or under resource pressure.

**The Solution:**
Re-run `ptd ensure` after the initial failure. By that point the CRDs will be available and the
`ClusterSecretStore` will apply successfully.
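
As a rough illustration of the retry described above, the sketch below polls until the CRD is visible before `ptd ensure` is re-run. It is not part of PTD: `wait_until` and `crd_exists` are hypothetical helpers, and the CRD name `clustersecretstores.external-secrets.io` is an assumption based on ESO's API group.

```python
import subprocess
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def crd_exists(name: str = "clustersecretstores.external-secrets.io") -> bool:
    """Hypothetical check: ask kubectl whether the CRD is registered (assumed name)."""
    result = subprocess.run(
        ["kubectl", "get", "crd", name],
        capture_output=True,
        check=False,
    )
    return result.returncode == 0


# Example usage (requires kubectl and cluster access, so it is left commented out):
# if wait_until(crd_exists):
#     print("CRD is available; re-run `ptd ensure`")
```

The clock and sleep functions are injectable only so the helper can be exercised without real waiting.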

---
72 changes: 72 additions & 0 deletions docs/team-operator/kind-site-example.yaml
@@ -0,0 +1,72 @@
# Example Site CR for kind local development
# Demonstrates cloud-agnostic configuration using standard Kubernetes resources
#
# Prerequisites:
# - kind cluster with standard StorageClass (default local-path-provisioner)
# - K8s Secrets created manually (dev-secrets, workload-secrets)
# - PostgreSQL database accessible from the cluster
#
# Usage:
# kubectl apply -f kind-site-example.yaml

apiVersion: core.posit.team/v1beta1
kind: Site
metadata:
  name: dev
  namespace: posit-team
  labels:
    app.kubernetes.io/instance: dev
spec:
  # Cloud-agnostic storage: uses kind's default StorageClass
  storageClassName: standard

  # Cloud-agnostic secrets: reference K8s Secrets by name
  # Create these manually for kind:
  #   kubectl create secret generic dev-secrets -n posit-team \
  #     --from-literal=dev.lic="..." \
  #     --from-literal=connect-apikey="..." \
  #     --from-literal=admin_token="..."
  secret:
    name: dev-secrets

  workloadSecret:
    name: workload-secrets

  # Database credentials (still needs type/vaultName format for now)
  mainDatabaseCredentialSecret:
    type: kubernetes
    name: postgres-credentials

  # Domain for accessing services
  domain: dev.localhost

  # Network trust level
  networkTrust: anyone

  # Product-specific configuration
  connect:
    # Cloud-agnostic IAM: explicit ServiceAccount name
    # For kind, no annotations needed (no cloud IAM integration)
    serviceAccountName: dev-connect
    # Storage buckets (not needed for local dev)

  workbench:
    serviceAccountName: dev-workbench
    # sessionTolerations: [] # optional, for node taints

  packageManager:
    serviceAccountName: dev-packagemanager
    # For kind, Package Manager can use the same storage as other products
    # No special Azure Files configuration needed

  chronicle:
    serviceAccountName: dev-chronicle

  flightdeck:
    serviceAccountName: dev-home

  # No gatewayRef needed for basic kind testing
  # kind can use traditional Ingress resources instead of Gateway API

  # No nfsEgressCIDR needed for local development
  # Network policies can be disabled or simplified for kind
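
The example's comments create `dev-secrets` with `kubectl create secret`. As an alternative sketch, the same Secret could be built as a manifest in Python; Kubernetes expects the values under `data` to be base64-encoded. The helper name `build_secret_manifest` and the placeholder values are hypothetical; only the Secret name, namespace, and key names come from the example above.

```python
import base64
import json


def build_secret_manifest(name: str, namespace: str, literals: dict) -> dict:
    """Build an Opaque Secret manifest; values are base64-encoded as `data` requires."""
    data = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in literals.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": data,
    }


# Placeholder values only; real license and token contents are not shown here.
manifest = build_secret_manifest(
    "dev-secrets",
    "posit-team",
    {
        "dev.lic": "placeholder-license",
        "connect-apikey": "placeholder-key",
        "admin_token": "placeholder-token",
    },
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest could be saved to a file and applied directly.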
6 changes: 6 additions & 0 deletions python-pulumi/src/ptd/__init__.py
@@ -428,6 +428,12 @@ class WorkloadClusterConfig:
    # After migration, set to False to let Helm manage CRDs going forward.
    team_operator_skip_crds: bool = False

    def __post_init__(self) -> None:
        # No-op implementation makes super().__post_init__() safe to call from subclasses
        # (e.g. AWSWorkloadClusterConfig) without requiring every intermediate class to guard
        # against AttributeError when the MRO reaches this base.
        pass


def load_workload_cluster_site_dict(
    cluster_site_dict: dict[str, typing.Any],
24 changes: 24 additions & 0 deletions python-pulumi/src/ptd/aws_workload.py
@@ -255,10 +255,29 @@ class AWSWorkloadClusterConfig(ptd.WorkloadClusterConfig):
    additional_node_groups: dict[str, ptd.NodeGroupConfig] = dataclasses.field(default_factory=dict)
    public_endpoint_access: bool = True
    ebs_csi_addon_version: str = "v1.41.0-eksbuild.1"
    pod_identity_agent_version: str | None = None
    enable_pod_identity_agent: bool = False
    enable_external_secrets_operator: bool = False
    # Requires the workload secret (secret_name) to contain 'fs-dns-name' (FSx NFS endpoint) before
    # `pulumi up` is run; a missing key causes a deploy-time error (dry runs warn instead).
    # Security note: the storageClass pathPattern derives subdirectory paths from the
    # nfs.io/storage-path PVC annotation, which is user-controlled. Any entity with PVC create
    # permissions can supply arbitrary paths; restrict via OPA/Gatekeeper or a
    # ValidatingWebhookConfiguration if cross-path access is a concern.
    enable_nfs_subdir_provisioner: bool = False  # PVCs must carry the nfs.io/storage-path annotation; the storageClass pathPattern uses it to derive subdirectory paths
    enable_efs_csi_driver: bool = False
    efs_config: ptd.EFSConfig | None = None
    karpenter_config: KarpenterConfig | None = None

    def __post_init__(self) -> None:
        super().__post_init__()
        if self.enable_external_secrets_operator and not self.enable_pod_identity_agent:
            msg = (
                "enable_external_secrets_operator requires enable_pod_identity_agent=True "
                "(ClusterSecretStore uses no auth block and relies on Pod Identity for credentials)."
            )
            raise ValueError(msg)
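
The interaction between the no-op base `__post_init__` and the subclass validation can be seen in isolation with a minimal sketch. The class names `BaseClusterConfig` and `AwsClusterConfig` are hypothetical stand-ins, not the real `ptd` classes, and only the two flags involved in the check are modelled.

```python
import dataclasses


@dataclasses.dataclass
class BaseClusterConfig:
    """Stand-in for the base config; the no-op hook keeps super() calls safe."""

    team_operator_skip_crds: bool = False

    def __post_init__(self) -> None:
        # Nothing to validate here; subclasses can still call super().__post_init__().
        pass


@dataclasses.dataclass
class AwsClusterConfig(BaseClusterConfig):
    """Stand-in for the AWS config with the cross-field check from the diff."""

    enable_pod_identity_agent: bool = False
    enable_external_secrets_operator: bool = False

    def __post_init__(self) -> None:
        super().__post_init__()
        if self.enable_external_secrets_operator and not self.enable_pod_identity_agent:
            msg = "enable_external_secrets_operator requires enable_pod_identity_agent=True"
            raise ValueError(msg)


# A valid combination constructs normally.
ok = AwsClusterConfig(enable_pod_identity_agent=True, enable_external_secrets_operator=True)

# An invalid combination fails at construction time, before any deployment starts.
try:
    AwsClusterConfig(enable_external_secrets_operator=True)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Validating in `__post_init__` means a misconfiguration surfaces when the config is loaded, not partway through a deploy.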


@dataclasses.dataclass(frozen=True)
class AWSWorkloadClusterComponentConfig(ptd.WorkloadClusterComponentConfig):
@@ -268,6 +287,8 @@ class AWSWorkloadClusterComponentConfig(ptd.WorkloadClusterComponentConfig):
    secret_store_csi_driver_aws_provider_version: str | None = "0.3.5"  # noqa: S105
    nvidia_device_plugin_version: str | None = "0.17.1"
    karpenter_version: str | None = "1.6.0"
    nfs_subdir_provisioner_version: str | None = "4.0.18"
    external_secrets_operator_version: str | None = "0.10.7"


class AWSWorkload(ptd.workload.AbstractWorkload):
@@ -585,6 +606,9 @@ def ebs_csi_role_name(self) -> str:
    def fsx_openzfs_role_name(self) -> str:
        return f"aws-fsx-openzfs-csi-driver.{self.compound_name}.posit.team"

    def external_secrets_role_name(self, release: str) -> str:
        return f"external-secrets.{release}.{self.compound_name}.posit.team"

    def cluster_home_role_name(self, release: str) -> str:
        return f"home.{release}.{self.compound_name}.posit.team"
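
To make the naming scheme concrete, here is a standalone sketch of the new helper as a free function. The `compound_name` and `release` values are invented for illustration, and the length check is an addition based on the 64-character limit for IAM role names; nothing in the diff shows such a check in `ptd`.

```python
IAM_ROLE_NAME_MAX_LENGTH = 64  # AWS limit for IAM role names


def external_secrets_role_name(compound_name: str, release: str) -> str:
    """Mirror the f-string from the diff, with compound_name passed explicitly."""
    return f"external-secrets.{release}.{compound_name}.posit.team"


def fits_iam_role_name_limit(role_name: str) -> bool:
    """Hypothetical guard: long release or workload names can exceed the limit."""
    return len(role_name) <= IAM_ROLE_NAME_MAX_LENGTH


# Hypothetical inputs, not taken from any real workload.
name = external_secrets_role_name("acme-staging", "20260226")
print(name)
print(fits_iam_role_name_limit(name))
```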

28 changes: 28 additions & 0 deletions python-pulumi/src/ptd/pulumi_resources/aws_eks_cluster.py
@@ -1306,6 +1306,34 @@ def with_aws_secrets_store_csi_driver_provider(

        return self

    def with_pod_identity_agent(
        self,
        version: str | None = None,
    ) -> typing.Self:
        """
        Add the EKS Pod Identity Agent addon.

        This addon enables EKS Pod Identity for associating IAM roles with
        Kubernetes service accounts without IRSA annotations. Pod Identity
        associations are created separately via aws.eks.PodIdentityAssociation.

        :param version: Optional, String, version of the addon to install.
            By setting this to None, the latest version will be installed.
        :return: self
        """
        self.pod_identity_agent_addon = aws.eks.Addon(
            f"{self.name}-eks-pod-identity-agent",
            args=aws.eks.AddonArgs(
                addon_name="eks-pod-identity-agent",
                addon_version=version,
                cluster_name=self.name,
                tags=self.eks.tags,
            ),
            opts=pulumi.ResourceOptions(parent=self.eks),
        )

        return self
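
The `with_*` method above returns `self` so that cluster components can be chained. The sketch below shows that pattern without Pulumi: `ClusterBuilder` is a hypothetical stand-in that records addon arguments in a plain dict where the real method creates an `aws.eks.Addon` resource.

```python
from __future__ import annotations

from typing import Any


class ClusterBuilder:
    """Hypothetical stand-in for the cluster class; stores addon args only."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.addons: dict[str, dict[str, Any]] = {}

    def with_pod_identity_agent(self, version: str | None = None) -> ClusterBuilder:
        # Same argument shape as the diff, kept as data so no cloud call is made.
        self.addons["eks-pod-identity-agent"] = {
            "addon_name": "eks-pod-identity-agent",
            "addon_version": version,
            "cluster_name": self.name,
        }
        return self  # returning self is what enables chaining


builder = ClusterBuilder("demo-cluster").with_pod_identity_agent()
print(builder.addons)
```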

    def attach_efs_security_group(
        self,
        efs_file_system_id: str,