Focused, comprehensive examples for running MarsBench with Hugging Face–hosted datasets and (optionally) locally downloaded data (from Zenodo).
Sections:
- Base Example
- Command Builder
- Classification Tutorial
- Segmentation Tutorial
- Detection Tutorial
- Callbacks (and Logging / Monitoring)
- Advanced Patterns
- Batch Runs
NOTE:
- data_name is the value you pass via data_name=...
- HF Repo Slug column shows the suffix used in repo_id="ORG/<slug>" (the organization/user prefix, e.g. Mirali33, may differ)
- Model Name column lists all registered models for that task
- If any slug differs in your local version, open marsbench/data/__init__.py for the authoritative mapping
| Task Type | Dataset Name (data_name param) | HF Repo Slug (example) | Model Names (model_name) |
|---|---|---|---|
| classification | domars16k | mb-domars16k | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | atmospheric_dust_classification_edr | mb-atmospheric_dust_cls_edr | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | atmospheric_dust_classification_rdr | mb-atmospheric_dust_cls_rdr | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | change_classification_ctx | mb-change_cls_ctx | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | change_classification_hirise | mb-change_cls_hirise | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | frost_classification | mb-frost_cls | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | landmark_classification | mb-landmark_cls | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | surface_classification | mb-surface_cls | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| classification | multi_label_mer | mb-surface_multi_label_cls | resnet101, vit, swin_transformer, inceptionv3, squeezenet |
| segmentation | boulder_segmentation | mb-boulder_seg | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | conequest_segmentation | mb-conequest_seg | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | crater_binary_segmentation | mb-crater_binary_seg | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | crater_multi_segmentation | mb-crater_multi_seg | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | mmls | mb-mmls | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | mars_seg_mer | mb-mars_seg_mer | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | mars_seg_msl | mb-mars_seg_msl | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| segmentation | s5mars | mb-s5mars | unet, deeplab, dpt, mask_rcnn, mask2former, segformer |
| detection | boulder_detection | mb-boulder_det | fasterrcnn, retinanet, ssd |
| detection | conequest_detection | mb-conequest_det | fasterrcnn, retinanet, ssd |
| detection | dust_devil_detection | mb-dust_devil_det | fasterrcnn, retinanet, ssd |
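Given the table, a repo_id is just the org/user prefix joined with the slug. A tiny shell sketch (the prefix and the subset of slugs below are copied from the table for illustration; `slug_for` is a hypothetical helper, not part of MarsBench):

```shell
# Hypothetical helper: build repo_id from org prefix + slug column.
ORG="Mirali33"                       # assumption: your org/user prefix may differ
slug_for() {                         # data_name -> HF slug (subset of the table)
  case "$1" in
    domars16k)              echo "mb-domars16k" ;;
    conequest_segmentation) echo "mb-conequest_seg" ;;
    boulder_detection)      echo "mb-boulder_det" ;;
    *) echo "unknown data_name: $1" >&2; return 1 ;;
  esac
}
echo "${ORG}/$(slug_for domars16k)"   # → Mirali33/mb-domars16k
```

A launch script can then pass `repo_id="${ORG}/$(slug_for "$DATA_NAME")"` instead of hard-coding each slug.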
A minimal training run on a Hugging Face dataset:
python -m marsbench.main \
mode=train \
task=classification \
model_name=resnet101 \
data_name=domars16k \
load_from_hf=true \
repo_id="Mirali33/mb-domars16k" \
training.trainer.max_epochs=1
Key points:
- mode=train (default if omitted)
- load_from_hf=true + repo_id="..." tells MarsBench to fetch the dataset via Hugging Face Hub.
- training.trainer.max_epochs=1 keeps it fast for a smoke test.
(If the dataset is already cached by datasets library, it will reuse it.)
MarsBench uses Hydra. Every CLI override is key=value. Nested config segments (e.g., training.trainer.max_epochs) descend into YAML structure.
General template:
python -m marsbench.main \
mode=<train|test|predict> \
task=<classification|segmentation|detection> \
model_name=<registered_model_key> \
data_name=<registered_dataset_key> \
[load_from_hf=true repo_id="HF_ORG/REPO"] \
[dataset_path=/absolute/or/relative/path] \
[checkpoint_path=path/to/checkpoint.ckpt] \
[output_path=custom/output/dir] \
[prediction_output_path=preds/out] \
training.trainer.max_epochs=E \
training.batch_size=B \
training.optimizer.lr=LR \
transforms=<transforms_config_name> \
callbacks.early_stopping.patience=10 \
logger.wandb.enabled=true
Parameter sources:
- model_name: must exist in configs/model/
- data_name: must exist in configs/data/
- transforms: defined in configs/transforms/
- logger.*, callbacks.*, training.* live in their respective config trees.
- Use +key=value to introduce keys not predefined (Hydra strict mode safeguard).
Quoting:
- Quote strings containing special characters or uppercase letters when in doubt: repo_id="Mirali33/mb-boulder_det"
Hydra multirun (launch several sweeps):
python -m marsbench.main -m \
task=classification model_name=resnet101 data_name=domars16k load_from_hf=true repo_id="Mirali33/mb-domars16k" \
training.optimizer.lr=0.001,0.0005 training.batch_size=32,64Outputs go into multirun/ timestamped folders.
Step-by-step from zero to evaluation.
Check the mapping in marsbench/data/__init__.py (datasets like domars16k, atmospheric_dust_classification_edr, surface_classification, etc.). Example using atmospheric dust:
python -m marsbench.main \
mode=train \
task=classification \
model_name=resnet101 \
data_name=atmospheric_dust_classification_edr \
load_from_hf=true \
repo_id="Mirali33/mb-atmospheric_dust_cls_edr" \
training.trainer.max_epochs=3 \
training.batch_size=32
- Download the archive (e.g., domars16k.zip) from Zenodo
- Unzip:
unzip domars16k.zip -d /data/mars/
- Ensure the folder structure matches the layout the pipeline expects
- Run without load_from_hf:
python -m marsbench.main \
task=classification \
model_name=resnet101 \
data_name=domars16k \
dataset_path=/data/mars/domars16k \
training.trainer.max_epochs=2
NOTE: Do not point dataset_path at a .zip; the dataloader expects an extracted directory.
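A small pre-flight check in the launch shell catches the two common local-data mistakes (still-zipped archive, wrong path). `check_dataset_path` is a hypothetical helper, not part of MarsBench:

```shell
# Hypothetical pre-flight check: refuse a path that is still a .zip or missing,
# since the dataloader expects an extracted directory.
check_dataset_path() {
  case "$1" in
    *.zip) echo "unzip first: $1" >&2; return 1 ;;
  esac
  if [ -d "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}
check_dataset_path /tmp   # any existing directory prints "ok: <path>"
```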
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
training.optimizer.name=AdamW \
training.optimizer.lr=0.0007 \
training.scheduler.name=cosine \
training.trainer.max_epochs=10 \
training.trainer.accumulate_grad_batches=2 \
training.trainer.precision=16
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
training.trainer.fast_dev_run=1
Or use a subset:
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
+data.subset=1000
Either add:
... test_after_training=true
Or run separately:
python -m marsbench.main \
mode=test \
task=classification \
model_name=resnet101 \
data_name=domars16k \
checkpoint_path=outputs/classification/domars16k/resnet101/<RUN_ID>/checkpoints/best.ckpt
python -m marsbench.main \
mode=predict \
task=classification \
model_name=resnet101 \
data_name=domars16k \
checkpoint_path=outputs/classification/domars16k/resnet101/<RUN_ID>/checkpoints/best.ckpt \
prediction_output_path=predictions/domars16k_resnet101
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
training.trainer.accelerator=gpu \
training.trainer.devices=2 \
training.trainer.strategy=ddp
Example dataset names: boulder_segmentation, conequest_segmentation, crater_binary_segmentation, crater_multi_segmentation, mars_seg_mer, mars_seg_msl, s5mars, mmls
python -m marsbench.main \
task=segmentation \
model_name=unet \
data_name=conequest_segmentation \
load_from_hf=true \
repo_id="Mirali33/mb-conequest_seg" \
training.trainer.max_epochs=5
If the config exposes partition (e.g., partition=0.1 used earlier):
python -m marsbench.main \
task=segmentation model_name=unet data_name=boulder_segmentation \
load_from_hf=true repo_id="Mirali33/mb-boulder_seg" \
partition=0.1 \
training.trainer.max_epochs=2
Check configs/transforms/ for segmentation-specific transforms (e.g., seg_default, heavy_aug). Example:
python -m marsbench.main \
task=segmentation model_name=unet data_name=conequest_segmentation \
load_from_hf=true repo_id="Mirali33/mb-conequest_seg" \
transforms=seg_default \
training.trainer.max_epochs=10
python -m marsbench.main \
task=segmentation \
model_name=deeplab \
data_name=conequest_segmentation \
load_from_hf=true repo_id="Mirali33/mb-conequest_seg" \
training.optimizer.lr=0.0001 \
training.trainer.max_epochs=15
python -m marsbench.main \
task=segmentation model_name=unet data_name=boulder_segmentation \
load_from_hf=true repo_id="Mirali33/mb-boulder_seg" \
training.trainer.precision=16 \
training.trainer.accumulate_grad_batches=4 \
training.trainer.max_epochs=20
Dataset keys may include boulder_detection, conequest_detection, dust_devil_detection.
python -m marsbench.main \
task=detection \
model_name=ssd \
data_name=boulder_detection \
load_from_hf=true \
repo_id="Mirali33/mb-boulder_det" \
training.trainer.max_epochs=5
python -m marsbench.main \
task=detection \
model_name=fasterrcnn \
data_name=boulder_detection \
load_from_hf=true repo_id="Mirali33/mb-boulder_det" \
training.optimizer.lr=0.0002 \
training.trainer.max_epochs=12
python -m marsbench.main \
mode=test \
task=detection \
model_name=ssd \
data_name=boulder_detection \
checkpoint_path=outputs/detection/boulder_detection/ssd/<RUN_ID>/checkpoints/best.ckpt
MarsBench integrates PyTorch Lightning callbacks & loggers via configs/callbacks and configs/logger.
Common callbacks keys (exact keys depend on YAML):
- callbacks.early_stopping.*
- callbacks.best_checkpoint.* (model checkpoint)
- callbacks.lr_monitor.enabled
- callbacks.progress_bar.refresh_rate
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
callbacks.early_stopping.monitor=val/accuracy \
callbacks.early_stopping.mode=max \
callbacks.early_stopping.patience=8
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
callbacks.best_checkpoint.save_top_k=3 \
callbacks.best_checkpoint.monitor=val/accuracy \
callbacks.best_checkpoint.mode=max
python -m marsbench.main \
task=segmentation model_name=unet data_name=conequest_segmentation \
load_from_hf=true repo_id="Mirali33/mb-conequest_seg" \
callbacks.lr_monitor.enabled=true
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
logger.wandb.enabled=true \
logger.wandb.project=MarsBench \
logger.wandb.name=domars16k_resnet101_test
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
logger.tensorboard.enabled=true \
logger.csv.enabled=true
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
output_path=experiments/domars16k_resnet101_custom
python -m marsbench.main -m \
task=classification data_name=domars16k load_from_hf=true repo_id="Mirali33/mb-domars16k" \
model_name=resnet101,vit \
training.optimizer.lr=0.001,0.0005
python -m marsbench.main \
task=classification model_name=resnet101 data_name=domars16k \
load_from_hf=true repo_id="Mirali33/mb-domars16k" \
seed=42
| Scenario | Required Args |
|---|---|
| Hugging Face dataset | load_from_hf=true repo_id="ORG/REPO" |
| Local extracted folder | dataset_path=/path/to/folder (omit load_from_hf) |
| Zenodo zip just downloaded | MUST unzip first; then dataset_path=<unzipped_root> |
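The table can be folded into a small helper so a launch script emits exactly one data-source flag set. `build_data_args` and the literal values inside are illustrative, not MarsBench defaults:

```shell
# Hypothetical helper: choose one data-source flag set per the table above.
build_data_args() {
  if [ "$1" = "hf" ]; then
    echo 'load_from_hf=true repo_id=Mirali33/mb-domars16k'
  else
    echo 'dataset_path=/data/mars/domars16k'
  fi
}
build_data_args hf   # → load_from_hf=true repo_id=Mirali33/mb-domars16k
```

A launch line then becomes: python -m marsbench.main task=classification model_name=resnet101 data_name=domars16k $(build_data_args hf)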
If both load_from_hf and dataset_path are given, which one wins depends on the config resolution order; pass only one to avoid ambiguity.
- ModuleNotFoundError: run pip install -e . from the repo root.
- CUDA OOM: reduce training.batch_size or use training.trainer.precision=16.
- Slow startup: first HF dataset download; subsequent runs cached.
- Permission issues on output: ensure outputs/ is writable or set output_path.
Purpose: Run many independent MarsBench configurations in parallel on an HPC using a SLURM job array. Each array index runs exactly one configuration (no Hydra -m multirun inside). This keeps logging isolated and makes failed reruns trivial.
Key ideas:
- Encode the Cartesian product of parameter lists into a COMBOS array.
- Use SLURM_ARRAY_TASK_ID to select one combo.
- Write one log (stdout+stderr) per array element via #SBATCH --output / --error.
- Trap errors to record failing indices.
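The indexing scheme can be seen in isolation with tiny lists (the simulated SLURM_ARRAY_TASK_ID stands in for the value SLURM injects inside a real array job):

```shell
# Minimal sketch: build the Cartesian product, then select one combo by index.
MODELS=(resnet101 vit)
DATASETS=(domars16k frost_classification)
COMBOS=()
for m in "${MODELS[@]}"; do
  for d in "${DATASETS[@]}"; do
    COMBOS+=("${m},${d}")
  done
done
SLURM_ARRAY_TASK_ID=2   # simulated; SLURM sets this for real array jobs
IFS=',' read -r MODEL DATASET <<< "${COMBOS[$SLURM_ARRAY_TASK_ID]}"
echo "${MODEL} ${DATASET}"   # → vit domars16k
```

Because combos are ordered deterministically, a given index always maps to the same configuration, which is what makes rerunning failed indices trivial.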
Save as scripts/sweep_classification.sbatch (make executable: chmod +x scripts/sweep_classification.sbatch).
#!/bin/bash
#SBATCH --job-name=marsbench_hf
#SBATCH --array=0-10 # TEMP placeholder; will be reset after combo count is known
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH -p general
#SBATCH -q public
#SBATCH -A grp_hkerner
#SBATCH --mem=16G
#SBATCH --output=./outputs/hpc/slurm-%A_%a.out
#SBATCH --error=./outputs/hpc/slurm-%A_%a.out
set -euo pipefail
# --- User environment (EDIT) ---
module load mamba
source activate kerner_lab
mkdir -p outputs/hpc
# Parameter lists (EDIT freely)
DATASETS=(domars16k atmospheric_dust_classification_edr atmospheric_dust_classification_rdr change_classification_ctx change_classification_hirise frost_classification landmark_classification surface_classification)
MODELS=(resnet101 vit swin_transformer inceptionv3 squeezenet)
HF_FLAG_VAL=(True False)
BATCH_SIZES=(32)
SEEDS=(42)
# Build combos (Cartesian product)
COMBOS=()
for d in "${DATASETS[@]}"; do
for m in "${MODELS[@]}"; do
for hf in "${HF_FLAG_VAL[@]}"; do
for bs in "${BATCH_SIZES[@]}"; do
for seed in "${SEEDS[@]}"; do
COMBOS+=("${m},${d},${hf},${bs},${seed}")
done
done
done
done
done
TOTAL=${#COMBOS[@]}
# Optional: Add print-total helper
if [[ "${1:-}" == "--print-total" ]]; then
echo "Total combinations: ${#COMBOS[@]}"
exit 0
fi
# Extract combo
IFS=',' read -r MODEL DATASET HF_FLAG BS SEED <<< "${COMBOS[$SLURM_ARRAY_TASK_ID]}"
RUN_TAG="cls_${DATASET}_${MODEL}_bs${BS}_seed${SEED}_hf${HF_FLAG}"
FAILED_LOG="outputs/hpc/failed_runs.txt"
# On failure, append a single line with index and params
trap 'echo "idx=${SLURM_ARRAY_TASK_ID} model=${MODEL} dataset=${DATASET} bs=${BS} seed=${SEED} hf=${HF_FLAG} " >> "${FAILED_LOG}"' ERR
echo "=== Starting ${RUN_TAG} (array index ${SLURM_ARRAY_TASK_ID}/${TOTAL}) ==="
python -m marsbench.main \
task=classification \
model_name="${MODEL}" \
data_name="${DATASET}" \
load_from_hf=${HF_FLAG} \
training.batch_size="${BS}" \
seed="${SEED}" \
training.trainer.max_epochs=1 \
logger.csv.enabled=true \
logger.tensorboard.enabled=false
echo "=== Finished ${RUN_TAG} ==="
To get the combo count, run the script with --print-total (or temporarily add: echo "TOTAL=$TOTAL"; exit 0 just after computing TOTAL). Then set --array=0-(TOTAL-1).
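With the parameter lists above (8 datasets x 5 models x 2 HF flags x 1 batch size x 1 seed), the inclusive array bounds work out as:

```shell
# SLURM array bounds are inclusive: TOTAL combos need indices 0..TOTAL-1.
TOTAL=$(( 8 * 5 * 2 * 1 * 1 ))
echo "--array=0-$((TOTAL - 1))"   # → --array=0-79
```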
After the job finishes, collect the failed indices recorded by the ERR trap and resubmit only those:
FAILED=$(grep -o 'idx=[0-9]*' outputs/hpc/failed_runs.txt | cut -d= -f2 | sort -un | paste -sd,)
if [[ -n "$FAILED" ]]; then
sbatch --array=${FAILED} scripts/sweep_classification.sbatch
fi
Key differences:
- task=segmentation
- data_name=conequest_segmentation
- models subset (unet deeplab)
- maybe partition=0.2 for quick tuning
Command section replacement:
python -m marsbench.main \
task=segmentation \
model_name="${MODEL}" \
data_name=conequest_segmentation \
load_from_hf=true \
repo_id="Mirali33/mb-conequest_seg" \
partition=0.2 \
training.optimizer.lr="${LR}" \
training.batch_size="${BS}" \
seed="${SEED}" \
training.trainer.max_epochs=20 \
output_path="${OUT_DIR}" \
callbacks.best_checkpoint.save_top_k=1
Classification (HF):
python -m marsbench.main task=classification model_name=resnet101 data_name=domars16k load_from_hf=true repo_id="Mirali33/mb-domars16k"
Segmentation (HF):
python -m marsbench.main task=segmentation model_name=unet data_name=conequest_segmentation load_from_hf=true repo_id="Mirali33/mb-conequest_seg"
Detection (HF):
python -m marsbench.main task=detection model_name=ssd data_name=boulder_detection load_from_hf=true repo_id="Mirali33/mb-boulder_det"
Local (after unzip):
python -m marsbench.main task=classification model_name=resnet101 data_name=domars16k dataset_path=/data/mars/domars16k
Add early stopping:
... callbacks.early_stopping.patience=5 callbacks.early_stopping.monitor=val/accuracy callbacks.early_stopping.mode=max
Enable WandB:
... logger.wandb.enabled=true logger.wandb.project=MarsBench