
New Cluster Scripts #273 (Draft)

misiugodfrey wants to merge 5 commits into main from misiug/SpaceMicePOC

Conversation

@misiugodfrey (Contributor):

Cluster Benchmarking Infrastructure

This PR adds multi-node TPC-H benchmarking support on a new cluster, including CPU and GPU variants, result validation, automated result posting, and sweep tooling.

New scripts

  • run-sweep.sh — Automates running launch-run.sh + post_results.py across multiple node/scale-factor combinations
  • run-presto-benchmarks.sh — Orchestrates the full benchmark lifecycle (setup, coordinator, workers, queries, results collection)
  • pull_ghcr_image.sh / enroot-decompress.sh — Pull container images from ghcr.io and save as .sqsh files, with transparent gzip/zstd decompression support
  • launch-gen-data.sh / gen-tpch-data.slurm — TPC-H data generation jobs
  • launch-analyze-tables.sh / run-analyze-tables.sh / run-analyze-tables.slurm — Hive table analysis jobs
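
The sweep flow described above can be sketched as a nested loop over node counts and scale factors. This is only an illustration: the launcher flags, the log file name, and the parameter lists are assumptions, not the actual run-sweep.sh interface.

```shell
#!/usr/bin/env bash
# Hypothetical sweep loop in the spirit of run-sweep.sh: iterate node
# counts and TPC-H scale factors, launching a run and posting results
# for each combination. Flag names and values below are assumptions.
set -euo pipefail

NODE_COUNTS=(1 2 4)
SCALE_FACTORS=(100 1000)

: > sweep.log   # start a fresh log for this sweep

for nodes in "${NODE_COUNTS[@]}"; do
  for sf in "${SCALE_FACTORS[@]}"; do
    echo "sweep: nodes=${nodes} sf=${sf}" | tee -a sweep.log
    # ./launch-run.sh --nodes "$nodes" --scale-factor "$sf"
    # python3 post_results.py --result-dir "results_${nodes}n_sf${sf}"
  done
done
```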

Benchmark execution (functions.sh, launch-run.sh, run-presto-benchmarks.slurm)

  • NUMA-aware worker placement with optional --no-numa flag for older images
  • CPU benchmark mode (--cpu) — disables cuDF, one worker per node, no GPU allocation
  • Conditional CUDA_VISIBLE_DEVICES and NVIDIA_VISIBLE_DEVICES for GPU vs CPU runs
  • Writable coord_data and per-worker worker_data_N directories to avoid EROFS on read-only squashfs
  • Miniforge bind-mount so Python shebangs resolve inside the coordinator container
  • inject_benchmark_metadata — injects run context (image digest, engine, node/GPU counts, timestamp) into benchmark_result.json on exit
  • collect_results — copies configs and logs into result_dir for archival
  • Configurable nodelist, image names, and output path via CLI flags
  • Stale result prevention: result_dir and OUTPUT_DIR are fully removed (rm -rf) before each run so cancelled jobs cannot post old data
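
A minimal sketch of the exit-time metadata injection, assuming environment variables such as IMAGE_DIGEST and SLURM_NNODES carry the run context; the real inject_benchmark_metadata in functions.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of inject_benchmark_metadata: merge run context
# into benchmark_result.json so even cancelled runs stay identifiable.
set -euo pipefail

RESULT_FILE="benchmark_result.json"
echo '{"queries": []}' > "$RESULT_FILE"   # stand-in result for the demo

inject_benchmark_metadata() {
  [[ -f "$RESULT_FILE" ]] || return 0
  python3 - "$RESULT_FILE" <<'PYEOF'
import datetime, json, os, sys

path = sys.argv[1]
with open(path) as f:
    data = json.load(f)
# The environment variable names here are assumptions.
data["context"] = {
    "image_digest": os.environ.get("IMAGE_DIGEST", "unknown"),
    "engine": os.environ.get("ENGINE", "presto"),
    "node_count": int(os.environ.get("SLURM_NNODES", "1")),
    "gpu_count": int(os.environ.get("NUM_WORKERS", "0")),
    "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
}
with open(path, "w") as f:
    json.dump(data, f, indent=2)
PYEOF
}

# In the real script this would be wired to an exit trap:
#   trap inject_benchmark_metadata EXIT
inject_benchmark_metadata
```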

Config generation (generate_presto_config.sh, templates)

  • Per-worker config directories (etc_worker_N/) are now generated for both GPU and CPU variants
  • CPU variant explicitly sets cudf.enabled=false
  • Worker config template additions: async-data-cache-enabled=false, cudf.jit_expression_enabled=false, cudf.intra_node_exchange=true, cudf.concat_optimization_enabled, cudf.batch_size_min_threshold
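
The per-worker config generation might look roughly like the following; the etc_worker_N/ layout and property names come from this PR, while the loop itself is an illustrative assumption about generate_presto_config.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of per-worker config generation for the CPU
# variant: one etc_worker_N/ directory per worker, each with cuDF
# explicitly disabled.
set -euo pipefail

NUM_WORKERS="${NUM_WORKERS:-2}"

for (( i = 0; i < NUM_WORKERS; i++ )); do
  dir="etc_worker_${i}"
  mkdir -p "$dir"
  cat > "${dir}/config.properties" <<EOF
coordinator=false
# CPU variant: cuDF is disabled explicitly, since the cluster does not
# gate GPU access through docker.
cudf.enabled=false
cudf.jit_expression_enabled=false
async-data-cache-enabled=false
EOF
done
```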

Result reporting (post_results.py, validate_results.py)

  • Metadata is now read directly from benchmark_result.json context (no separate benchmark.json required)
  • Added node_count (cluster nodes) distinct from gpu_count (GPU workers/total workers)
  • CPU runs automatically report gpu_count=0 and gpu_name="N/A"
  • Per-query validation results attached to each query log entry; xfail status mapped to expected-failure for API compatibility
  • --velox-branch, --velox-repo, --presto-branch, --presto-repo args added to engine_config payload
  • identifier_hash falls back to image_digest from context if not provided on CLI
  • New validate_results.py for comparing query output against expected results
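
The CPU-run reporting rule can be illustrated with a small sketch; the JSON layout, the engine key, and the cpu-substring check are assumptions based on the description above, not the actual post_results.py logic.

```shell
#!/usr/bin/env bash
# Sketch: detect a CPU run from the benchmark_result.json context and
# map it to gpu_count=0 / gpu_name="N/A".
set -euo pipefail

cat > benchmark_result.json <<'EOF'
{"context": {"engine": "presto-cpp-cpu", "node_count": 4}}
EOF

engine=$(python3 -c 'import json; print(json.load(open("benchmark_result.json"))["context"]["engine"])')

if [[ "$engine" == *cpu* ]]; then
  gpu_count=0
  gpu_name="N/A"
else
  gpu_count="${NUM_WORKERS:-8}"
  gpu_name="$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n1)"
fi

echo "gpu_count=${gpu_count} gpu_name=${gpu_name}" | tee gpu_report.txt
```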

# We want to propagate any changes from the original worker config to the new
# worker configs even if we did not re-generate the configs.
-if [[ -n "$NUM_WORKERS" && "$VARIANT_TYPE" == "gpu" ]]; then
+if [[ -n "$NUM_WORKERS" && ( "$VARIANT_TYPE" == "gpu" || "$VARIANT_TYPE" == "cpu" ) ]]; then
misiugodfrey (Contributor Author):

Needed because the cluster replicates worker configs for the CPU variant in the same way as the GPU configs.

# Add a cluster tag for the CPU variant
echo "cluster-tag=native-cpu" >> "${COORD_CONFIG}"
# Disable cuDF for CPU mode
sed -i 's/^cudf\.enabled=true/cudf.enabled=false/' "${WORKER_CONFIG}"
misiugodfrey (Contributor Author):

On the cluster we aren't using docker to control GPU access, so we need to actively disable cuDF rather than letting it be disabled via the docker environment.

@misiugodfrey misiugodfrey marked this pull request as ready for review March 13, 2026 18:32
@misiugodfrey misiugodfrey marked this pull request as draft March 13, 2026 20:52
@misiugodfrey (Contributor Author):

Making this a draft, as many portions of it are not Slurm-specific and should be pulled out (validation, benchmark posting, etc.).
