test: add bare-metal custom S3 dataplane values test case#253
test: add bare-metal custom S3 dataplane values test case#253
Conversation
Adds a test case for a bare-metal deployment using custom S3-compatible storage (stow), with prometheus disabled, flytepropeller disabled (executor mode), and custom namespace mapping. Intentionally preserves values as-is to surface issues during review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Current Aviator status
This pull request is currently open (not queued). How to mergeTo merge this PR, comment
See the real-time status of this PR on the
Aviator webapp.
Use the Aviator Chrome Extension
to see the status of your PR within GitHub.
|
mhotan
left a comment
There was a problem hiding this comment.
Deployment Issue Analysis
Review of the generated manifest at tests/generated/dataplane.baremetal-custom-s3.yaml reveals these issues:
🔴 HIGH — imageBuilder.defaultRepository has https:// protocol prefix
Values line: defaultRepository: "https://ghcr.io/acme-corp/acme/union"
Container registries expect a bare hostname/path, not a URL. BuildKit will fail to push images because https://ghcr.io/... is not a valid image reference.
Fix: defaultRepository: "ghcr.io/acme-corp/acme/union"
🔴 HIGH — storage.region not overridden at top level
The custom stow config has region: RNO2A, but the chart's top-level storage.region defaults to us-east-1. The operator's clusterData.bucketRegion resolves to the top-level value:
bucketRegion: 'us-east-1' # ← WRONG, should be RNO2AFix: Add region: RNO2A under the top-level storage: key.
🔴 HIGH — Duplicate container: key in rendered storage config
The storage template emits container: from storage.bucketName, then the custom provider block re-emits it from storage.custom.container:
storage:
container: "union" # from storage.bucketName
container: union # from storage.custom ← DUPLICATEThis is invalid YAML. Most parsers silently take the last value, but strict parsers will reject it.
Fix: Remove container: union from storage.custom (it's already provided by storage.bucketName).
🟡 MEDIUM — prometheus.enabled: false but OpenCost still enabled
OpenCost defaults to enabled: true and is not disabled in these values. The rendered manifest creates a full OpenCost deployment pointing at a non-existent Prometheus service:
- name: PROMETHEUS_SERVER_ENDPOINT
value: "http://union-operator-prometheus.union.svc:80/prometheus"OpenCost pods will crashloop.
Fix: Add opencost: { enabled: false }
🟡 MEDIUM — Operator references non-existent Prometheus service
Even with OpenCost fixed, the operator's dependenciesHeartbeat.prometheus.endpoint and the global PROMETHEUS_SERVICE_URL env var still reference http://union-operator-prometheus:80 which does not exist. This affects all 5 deployments (executor, webhook, proxy, operator, syncresources).
Fix: Override config.operator.dependenciesHeartbeat.prometheus to remove/disable, or accept operator will log connection errors.
🟡 MEDIUM — AWS IAM role annotations on a bare-metal cluster
The default userRoleAnnotationKey (eks.amazonaws.com/role-arn) and userRoleAnnotationValue (arn:aws:iam::ACCOUNT_ID:role/flyte_project_role) are rendered into cluster resource sync templates. On bare-metal these are meaningless.
Fix: Add userRoleAnnotationKey: "" and userRoleAnnotationValue: ""
🟢 LOW — monitoring.enabled and cost.enabled default to true
Creates orphaned ServiceMonitor resources with no Prometheus to scrape them.
Fix: Add monitoring: { enabled: false } and cost: { enabled: false }
🟢 LOW — Security: S3 credentials in ConfigMaps
The stow access key and secret key appear in plain text in 4 ConfigMap data entries. ConfigMaps are not encrypted at rest even with K8s secret encryption.
🐛 BUG (chart template) — namespace_config.yaml not processed through tpl
In charts/dataplane/templates/clusterresourcesync/configmap.yaml, the namespace_config.yaml key uses toYaml without tpl, causing Helm backtick escapes to appear literally in the output. Compare with namespace_mapping.yaml which correctly uses tpl. This is a chart bug, not a values bug.
🤖 Generated with Claude Code
Summary
dataplane.baremetal-custom-s3) for a bare-metal deployment using custom S3-compatible storage (stow provider)prometheus,flytepropeller,fluentbit,namespacesexecutor(replaces propeller),imageBuilder+buildkit, customnamespace_mappingKnown issues in these values
See review comments for details. A follow-up PR will address fixes.
Test plan
make generate-expectedpassesmake helm-testpasses🤖 Generated with Claude Code
main