add history server and "--files" support #106

GeorgeJahad merged 29 commits into armadaproject:master from
Conversation
Force-pushed from 95a1da9 to 4f9b47f
Pull request overview
Adds first-class support for Spark’s distributed resource flags (--files, --jars, --archives) when submitting to Armada (including “submit-in-driver” behavior), and introduces scripts to enable S3-backed Spark event logging plus running a Spark History Server against those logs.
Changes:
- Propagate file-upload-related system properties from Spark K8s feature steps into the SparkConf used for Armada driver/executor submission.
- Extend SparkSubmit option assignment and archive-download behavior to include the ARMADA cluster manager for multiple Spark versions (3.3/3.5/4.1).
- Add init scripts to configure S3 credentials + Spark event logging, and provide History Server and --files usage examples.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/main/scala/org/apache/spark/deploy/armada/submit/ArmadaClientApplication.scala | Adds driver feature-step system property propagation and local app resource resolution logic. |
| src/main/scala-spark-3.3/org/apache/spark/deploy/SparkSubmit.scala | Enables ARMADA for --files/--archives/--jars and driver-side resource download behavior. |
| src/main/scala-spark-3.5/org/apache/spark/deploy/SparkSubmit.scala | Same as above for Spark 3.5. |
| src/main/scala-spark-4.1/org/apache/spark/deploy/SparkSubmit.scala | Same as above for Spark 4.1. |
| src/test/scala/org/apache/spark/deploy/armada/submit/ArmadaClientApplicationSuite.scala | Adds tests for the new resource-resolution behavior and updates fixtures for the new config field. |
| src/test/scala/org/apache/spark/deploy/armada/e2e/ArmadaSparkE2E.scala | Adjusts the Python e2e Spark conf (removes the upload-path setting). |
| scripts/init.sh | Adds S3_CONF and EVENT_LOG_CONF helpers for S3 credentials and event logging. |
| scripts/submitArmadaSpark.sh | Injects S3/event-log config into the standard Armada submit script. |
| scripts/runHistoryServer.sh | New script to run the Spark History Server against S3 event logs, with basic error detection. |
| scripts/filesParameterExample.sh | New example demonstrating --files distribution to executors. |
| .gitignore | Ignores the generated example/files directory created by the example script. |
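The `S3_CONF` and `EVENT_LOG_CONF` helpers described above could be sketched roughly as follows. The array names come from the PR; the specific configuration keys, bucket, and splice point are illustrative assumptions, not the merged script's exact contents.

```shell
# Illustrative sketch only -- the real init.sh may use different keys/values.
S3_CONF=(
  --conf "spark.hadoop.fs.s3a.access.key=${AWS_ACCESS_KEY_ID:-}"
  --conf "spark.hadoop.fs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY:-}"
)
EVENT_LOG_CONF=(
  --conf "spark.eventLog.enabled=true"
  --conf "spark.eventLog.dir=s3a://spark-events/logs"   # assumed bucket/prefix
)

# submitArmadaSpark.sh can then splice both arrays into the submit call;
# here we only echo the shape of the command instead of running it:
echo spark-submit "${S3_CONF[@]}" "${EVENT_LOG_CONF[@]}" my-app.jar
```

Keeping the flags in bash arrays (rather than a flat string) preserves word splitting when values contain spaces, which is why the `"${ARR[@]}"` expansion form matters here.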
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
|
|
||
/** URI schemes for remote storage — files with these schemes are not local filesystem paths. */
private val remoteSchemes =
  Set("s3a", "s3", "hdfs", "gs", "wasb", "wasbs", "abfs", "abfss", "http", "https", "ftp")
Couldn't you say everything that is `!isLocalFile` is remote?

Thanks, I simplified this.
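The idea behind `remoteSchemes` (and the reviewer's simplification) can be illustrated with a small hypothetical check; `is_remote` is not a function from the PR, just a sketch of classifying a URI by its scheme:

```shell
# Hypothetical helper: a URI is "remote" if it uses one of the remote
# storage schemes; anything else is treated as a local filesystem path.
is_remote() {
  case "$1" in
    s3a://*|s3://*|hdfs://*|gs://*|wasb://*|wasbs://*|abfs://*|abfss://*|http://*|https://*|ftp://*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

is_remote "s3a://bucket/app.jar" && echo "remote"   # prints "remote"
is_remote "/tmp/app.jar" || echo "local"            # prints "local"
```

The reviewer's point is that maintaining the scheme allowlist is unnecessary if a local-file predicate already exists: `!isLocalFile` covers every remote case without enumerating schemes.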
sudiptob2 left a comment:

Looks good, a few minor comments.
Signed-off-by: George Jahad <github@blackbirdsystems.net>
Force-pushed from 00554e9 to 2d41721
Thanks for the reviews @sudiptob2 and @EnricoMi.
What: Add --files support for Armada cluster-mode submissions and enable automatic S3-backed event logging for the Spark History Server.

Why: Armada's cluster-mode driver needs to resolve locally-submitted files (via --files) to their uploaded remote paths so executors can access them. Without this, file distribution silently fails. Event logging to S3 enables post-hoc job inspection via the Spark History Server.

Changes:
- Add resolveLocalAppResource to map local app resources to their feature-step-uploaded remote URIs (e.g., S3) using the --class index position in container args
- Add applyFileUploadProperties to propagate resolved spark.jars, spark.files, spark.archives, and spark.submit.pyFiles from driver feature steps into SparkConf
- Set spark.kubernetes.submitInDriver=true so the driver pod downloads remote files locally on startup
- Add isArmadaClusterModeDriver and | ARMADA to the OptionAssigner entries for files/archives/jars in all three version-specific SparkSubmit.scala files (3.3, 3.5, 4.1)
- Add S3_CONF and EVENT_LOG_CONF arrays to init.sh for automatic S3 credential and event log configuration
- Add scripts/runHistoryServer.sh to launch a Spark History Server reading event logs from S3, with error detection if the log directory doesn't exist
- Add scripts/filesParameterExample.sh demonstrating the --files parameter with a UUID-tagged CSV distributed to executors
- Add EVENT_LOG_CONF and S3_CONF to submitArmadaSpark.sh so jobs automatically capture event logs when S3 is configured

Tests:
- resolveLocalAppResource tests covering local/remote paths, S3 resolution, missing --class, and missing container
- Verify that driverSystemProperties from feature steps are populated in ArmadaJobConfig

How to verify:
- mvn test -DwildcardSuites=org.apache.spark.deploy.armada.submit.ArmadaClientApplicationSuite — all tests pass
- scripts/filesParameterExample.sh — executors print the UUID from the distributed CSV file
- scripts/submitArmadaSpark.sh -p 100 — Pi job completes, event logs written to S3
- scripts/runHistoryServer.sh — History Server UI at http://localhost:18080 shows completed jobs
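The --files verification step could be pictured with a hedged sketch of what filesParameterExample.sh does. The file name, master URL, and application name below are illustrative assumptions, and the submit command is only echoed rather than run:

```shell
# Sketch: tag a CSV with a fresh UUID, then ship it to executors via --files.
uuid="$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)"
printf 'id,value\n%s,42\n' "$uuid" > example.csv

# The real script submits against Armada; here we only show the shape of
# the command (master URL and app name are assumptions, "..." left as-is).
echo "spark-submit --master armada://... --files example.csv examplePi.py"

# An executor reading example.csv would see this UUID, which is what makes
# the example self-verifying: a stale or missing file cannot match.
grep -q "$uuid" example.csv && echo "csv tagged: $uuid"
rm -f example.csv
```

Tagging the CSV with a per-run UUID is the key trick: matching output on the executors proves the freshly uploaded file was distributed, not a cached copy.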