support minExecutors=0 in dynamic mode#107
Conversation
Signed-off-by: Sudipto Baral <sudiptobaral.me@gmail.com>
Force-pushed 01a4d9c to ce980c4
Signed-off-by: Sudipto Baral <sudiptobaral.me@gmail.com>
Pull request overview
Updates Armada’s dynamic allocation support to allow spark.dynamicAllocation.minExecutors=0 by separating “bootstrap” executor count (initialExecutors) from the scale-down floor, and adjusts tests/examples accordingly.
Changes:
- Replace "executor count" semantics with `getInitialExecutorCount`, using `spark.dynamicAllocation.initialExecutors` as the one-time gang bootstrap batch.
- Enforce/propagate `spark.armada.scheduling.nodeUniformity` for dynamic modes and add scale-from-zero coverage in unit/E2E tests.
- Update example Jupyter notebook guidance and dynamic allocation configuration.
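The bootstrap/floor split described above can be sketched roughly as follows. This is an illustrative reconstruction, not the PR's actual code; the function signature and the defaulting of `initialExecutors` to `minExecutors` are assumptions.

```scala
// Hypothetical sketch: derive the one-time gang bootstrap size from
// initialExecutors rather than from minExecutors (the scale-down floor).
def getInitialExecutorCount(conf: Map[String, String]): Int = {
  val min = conf.getOrElse("spark.dynamicAllocation.minExecutors", "0").toInt
  // Assume initialExecutors defaults to minExecutors when unset.
  val initial =
    conf.getOrElse("spark.dynamicAllocation.initialExecutors", min.toString).toInt
  // The gang bootstrap always needs at least one executor, even when
  // minExecutors is 0, so scale-from-zero can still form an initial gang.
  math.max(initial, 1)
}
```

With this separation, `minExecutors=0` only controls how far Spark may later scale down; it no longer forces an empty bootstrap batch.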
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| src/main/scala/org/apache/spark/deploy/armada/DeploymentModeHelper.scala | Introduces getInitialExecutorCount; dynamic modes now use initialExecutors for bootstrap and add stricter validation (nodeUniformity + minimum initialExecutors). |
| src/main/scala/org/apache/spark/deploy/armada/submit/ArmadaClientApplication.scala | Submits initial executor batch using getInitialExecutorCount and relaxes static-only validation to allow dynamic min=0. |
| src/main/scala/org/apache/spark/scheduler/cluster/armada/ArmadaClusterManagerBackend.scala | Uses getInitialExecutorCount when proactively requesting executors. |
| src/test/scala/org/apache/spark/deploy/armada/DeploymentModeHelperSuite.scala | Updates expectations and adds new tests for scale-from-zero and initialExecutors/minExecutors behavior. |
| src/test/scala/org/apache/spark/deploy/armada/submit/ArmadaClientApplicationSuite.scala | Adjusts test config to include nodeUniformity + initialExecutors for dynamic min=0 scenario. |
| src/test/scala/org/apache/spark/deploy/armada/e2e/E2ETestBuilder.scala | Changes dynamic allocation helper to accept initialExecutors, forces minExecutors=0, and sets nodeUniformity. |
| src/test/scala/org/apache/spark/deploy/armada/e2e/ArmadaSparkE2E.scala | Updates node selector test values (currently to armada-worker2). |
| example/jupyter/notebooks/jupyter_armada_spark.ipynb | Updates dynamic allocation notes and example config for minExecutors=0. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
```scala
.withNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker2"))
.assertNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker2"))
```
The node selector value was changed from "armada-worker" to "armada-worker2", but the E2E job template in this repo still uses "armada-worker" (src/test/resources/e2e/templates/spark-pi-job-template.yaml). Unless the test environment actually has a node named armada-worker2, this will likely make the node selector E2E tests fail. Please keep this consistent with the template/environment or make the selector configurable via test config.
Suggested change:

```diff
- .withNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker2"))
- .assertNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker2"))
+ .withNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker"))
+ .assertNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker"))
```
```scala
// Armada is multi-cluster: nodeUniformity must be configured so Armada knows
// which label to use for co-locating the gang and for subsequent scale-up node selectors.
if (!nodeUniformityConfigured) {
  throw new IllegalArgumentException(
    "spark.armada.scheduling.nodeUniformity must be configured in " +
    "dynamic cluster mode. Armada is multi-cluster and requires " +
    "a node uniformity label to co-locate all executors on the " +
    "same cluster."
```
DynamicCluster now throws if spark.armada.scheduling.nodeUniformity is not configured. Since Armada also supports pinning placement via spark.armada.scheduling.nodeSelectors (which are applied to submitted pods), consider allowing dynamic mode without nodeUniformity when nodeSelectors are explicitly set (or document clearly why nodeSelectors aren’t sufficient). As-is, this is a behavior change that can break existing dynamic cluster deployments that relied on nodeSelectors or single-cluster setups.
Suggested change:

```diff
- // Armada is multi-cluster: nodeUniformity must be configured so Armada knows
- // which label to use for co-locating the gang and for subsequent scale-up node selectors.
- if (!nodeUniformityConfigured) {
-   throw new IllegalArgumentException(
-     "spark.armada.scheduling.nodeUniformity must be configured in " +
-     "dynamic cluster mode. Armada is multi-cluster and requires " +
-     "a node uniformity label to co-locate all executors on the " +
-     "same cluster."
+ // Detect whether explicit node selectors have been configured for scheduling.
+ // When nodeSelectors are set, they can be used to pin placement even without
+ // a nodeUniformity label.
+ private val nodeSelectorsConfigured: Boolean =
+   conf.getOption(ARMADA_SCHEDULING_NODE_SELECTORS).exists(_.trim.nonEmpty)
+ // Armada is multi-cluster: nodeUniformity must be configured so Armada knows
+ // which label to use for co-locating the gang and for subsequent scale-up node selectors.
+ // However, if explicit nodeSelectors are configured, allow dynamic cluster mode
+ // to proceed without nodeUniformity for backward compatibility.
+ if (!nodeUniformityConfigured && !nodeSelectorsConfigured) {
+   throw new IllegalArgumentException(
+     "spark.armada.scheduling.nodeUniformity must be configured in " +
+     "dynamic cluster mode unless explicit spark.armada.scheduling.nodeSelectors " +
+     "are set. Armada is multi-cluster and requires a node uniformity label to " +
+     "co-locate all executors on the same cluster when no node selectors are used."
```
nodeSelectors are per-pod constraints applied at submission time; they can't replace nodeUniformity, which instructs Armada's gang scheduler to co-locate the initial batch on the same cluster before any node selector is known.
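Under the new rules, a valid scale-from-zero dynamic configuration would combine `minExecutors=0` with a bootstrap batch and a mandatory uniformity label. The sketch below uses the config key names discussed in this PR; the values are illustrative examples only.

```scala
// Example dynamic-allocation settings for scale-from-zero (values illustrative):
val dynamicConf = Map(
  "spark.dynamicAllocation.enabled"          -> "true",
  "spark.dynamicAllocation.minExecutors"     -> "0",           // now allowed
  "spark.dynamicAllocation.initialExecutors" -> "2",           // one-time gang bootstrap
  "spark.armada.scheduling.nodeUniformity"   -> "armada-spark" // required in dynamic mode
)
```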
```diff
  // Armada is multi-cluster: We use gang scheduling to co-locate the initial batch on one cluster,
  // then capture the cluster label for all subsequent submissions.
  // nodeUniformity must be configured so Armada knows which label to use.
  if (!nodeUniformityConfigured) {
    throw new IllegalArgumentException(
-     s"spark.dynamicAllocation.minExecutors must be >= 2 in " +
-     s"dynamic client mode, but got: ${getExecutorCount}. " +
-     s"Armada requires gang cardinality >= 2 to co-locate " +
-     s"executors in the same cluster."
+     "spark.armada.scheduling.nodeUniformity must be configured in " +
+     "dynamic client mode. Armada is multi-cluster and requires " +
+     "a node uniformity label to co-locate all executors on the " +
+     "same cluster."
    )
  }
```
DynamicClient now throws if spark.armada.scheduling.nodeUniformity is not configured. If the intent is specifically to support scale-from-zero in multi-cluster environments, consider permitting dynamic client mode to run without nodeUniformity when spark.armada.scheduling.nodeSelectors is explicitly set (or when running in a known single-cluster environment). Otherwise this becomes a breaking configuration requirement unrelated to minExecutors=0 itself.
Signed-off-by: Sudipto Baral <sudiptobaral.me@gmail.com>
```scala
// In cluster mode, driver is part of the gang, so 1 executor + driver = cardinality 2.
if (initialExecutorCount < 1) {
  throw new IllegalArgumentException(
```
Instead of throwing, should we just force the count to 1 and warn the user?
Not sure that is a good approach; it would also create configuration drift.
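The two options under discussion can be sketched side by side. These helper names are hypothetical, introduced only to illustrate the trade-off; neither is the PR's actual code.

```scala
// Option 1: fail fast (the PR's approach) - surface the misconfiguration.
def validateStrict(initialExecutorCount: Int): Int = {
  if (initialExecutorCount < 1)
    throw new IllegalArgumentException(
      s"initialExecutors must be >= 1 in cluster mode, got $initialExecutorCount")
  initialExecutorCount
}

// Option 2: clamp and warn (the suggestion above) - the job starts, but the
// effective value silently drifts from what the user configured.
def validateLenient(initialExecutorCount: Int): Int = {
  if (initialExecutorCount < 1) {
    System.err.println(
      s"WARN: forcing initialExecutors from $initialExecutorCount to 1")
    1
  } else initialExecutorCount
}
```

The drift concern is that with the lenient variant, the running job's gang size no longer matches the submitted configuration, which can be confusing to debug.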
```diff
  )
- .withNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker"))
- .assertNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker"))
+ .withNodeSelectors(Map("kubernetes.io/hostname" -> "armada-worker2"))
```
what is the reason for this change?
I made nodeUniformity mandatory for dynamic allocation. In the E2E tests we globally set nodeUniformity=armada-spark, and in our test kind cluster setup only armada-worker2 has that label.
So if we try to select a node that doesn't have the label, Armada won't schedule the job. That's why the node selector test is pinned to armada-worker2.
> Now, in our test kind cluster setup, only armada-worker2 has that label.
where does this get set?
Here it is:
armada-spark/e2e/armada-operator.patch, lines 38 to 41 in e38e857
(line 38 - first worker, line 39 - second worker)
I had a couple of minor questions, but everything looks good; I'll approve after I see the answers.
Signed-off-by: Sudipto Baral <sudiptobaral.me@gmail.com>
GeorgeJahad left a comment:
lgtm, thanks @sudiptob2 !
Closes G-Research-Forks/spark#186
Closes G-Research-Forks/spark#189