chore(recipe): bump dynamo-platform from 0.9.x to 1.0.1 #459
Jont828 wants to merge 4 commits into NVIDIA:main
Upgrade Dynamo platform to the latest 1.0.1 release across registry and all inference overlay recipes.
Key changes for the 1.0 schema:
- Registry: defaultVersion 0.9.1 → 1.0.1
- Overlays: all 5 dynamo overlays updated from 0.9.0 → 1.0.1
- Values: rewritten for 1.0 Helm schema (global.* subchart controls, upgradeCRD: true, removed stale image pins and kube-rbac-proxy workaround fixed upstream)
Signed-off-by: Jont828 <jt572@cornell.edu>
/ok to test 852739e
@dims Merged changes from main, can we get another CI run?
yuanchen8911 left a comment
Review
Clean, well-scoped change (+31/-34, 7 files). The 0.9.x → 1.0.1 migration and Helm schema rewrite look correct overall, with one significant issue:
High: Grove dropped from deployment without external replacement
The old values had `grove: enabled: true` at the top level, deploying grove as a subchart of dynamo-platform. The new values set `global.grove.install: false` + `global.grove.enabled: true`, which tells the Dynamo operator "grove is installed externally", but AICR has no standalone grove component:
- No `grove` entry exists in `recipes/registry.yaml` (only `nodeScheduling` paths referencing the subchart at L376/379)
- No Dynamo overlay lists grove as a `dependencyRef`
This effectively removes grove from the bundle, which is a behavioral regression for multinode/PodClique-based Dynamo workloads. The upstream 1.0.1 docs confirm `global.grove.enabled: true` only enables integration with an already-installed grove; it does not deploy it.
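For reference, the two value shapes discussed above can be sketched as follows. The key names are taken from this review; their exact placement within `recipes/components/dynamo-platform/values.yaml` is an assumption, and all surrounding keys are omitted.

```yaml
# Before (0.9.x schema): grove deployed as a subchart of dynamo-platform.
grove:
  enabled: true
---
# After (this PR, 1.0.1 schema): integration is on, but nothing installs grove.
global:
  grove:
    install: false   # do not deploy the grove subchart
    enabled: true    # integrate with a grove that is assumed to exist already
```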
Fix options:
- Add a standalone `grove` component to `registry.yaml` and the Dynamo overlay `dependencyRefs`, or
- Set `global.grove.install: true` to keep deploying grove as a subchart
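A minimal sketch of the second option, assuming the `global.grove.*` keys behave as described in this review (not verified against the chart here):

```yaml
# Hypothetical fragment of the dynamo-platform values; other keys omitted.
global:
  grove:
    install: true    # keep deploying grove as a dynamo-platform subchart
    enabled: true    # keep the operator's grove integration on
```

The first option would need a new registry entry and overlay `dependencyRefs` changes, and the schema for those is not shown in this PR, so it is not sketched.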
Minor observations
- Registry/overlay versions now aligned: registry was `0.9.1` while overlays pinned `0.9.0`; both are `1.0.1` now. Good.
- `dynamo-crds` + `upgradeCRD: true`: overlays still have `dependencyRefs: [dynamo-crds]`. Confirm the separate CRD chart and platform chart's `upgradeCRD` don't cause CRD ownership conflicts.
- KWOK e2e and cluster deploy unchecked: given this is a schema rewrite, recommend completing before merge.
Summary
- `recipes/components/dynamo-platform/values.yaml` rewritten for the 1.0 Helm schema: `global.*` subchart controls, `upgradeCRD: true`, removed stale image pin and kube-rbac-proxy workaround (fixed upstream)
- `dynamo-crds` version intentionally unchanged (no 1.0 CRD chart exists; platform chart now bundles CRDs via `upgradeCRD`)
Test plan
- `go test -race ./pkg/recipe/... -count=1`: passes
- `go test -race ./pkg/bundler/... -count=1`: passes
- `make test`: all tests pass, coverage 72%+
- `make lint` (golangci-lint + yamllint on changed files): clean
- `make kwok-e2e RECIPE=h100-eks-ubuntu-inference-dynamo`
- `dynamo-platform` 1.0.1 chart renders correctly with `global.*` keys
🤖 Generated with Claude Code