SEML - Slurm Experiment Management Library.
Usage:
$ seml [OPTIONS] COLLECTION COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Arguments:
COLLECTION: The name of the database collection to use. [required]
Options:
--migration-skip: Skip the migration of the database collection.
--migration-backup: Back up the database collection before migration.
-v, --verbose: Whether to print debug messages.
-V, --version: Print the version number.
--install-completion: Install completion for the current shell.
--show-completion: Show completion for the current shell, to copy it or customize the installation.
--help: Show this message and exit.
Commands:
add: Add experiments to the database as defined...
cancel: Cancel the Slurm job/job step...
claim-experiment: Claim an experiment from the database.
clean-db: Remove orphaned artifacts in the DB from...
clean-jobs: Cancel empty pending jobs.
configure: Configure SEML (database, argument...
delete: Delete experiments by ID or state (cancels...
description: Manage descriptions of the experiments in...
detect-duplicates: Prints duplicate experiment configurations.
detect-killed: Detect experiments where the corresponding...
download-sources: Download source files from the database to...
drop: Drop collections from the database.
hold: Hold queued experiments via SLURM.
launch-worker: Launch a local worker that runs PENDING jobs.
list: Lists all collections in the database.
prepare-experiment: Fetch experiment from database, prepare it...
print-command: Print the commands that would be executed...
print-experiment: Print the experiment document.
print-fail-trace: Prints fail traces of all failed experiments.
print-output: Print the output of experiments.
project: Setting up new projects.
queue: Prints the collections of the given job IDs.
release: Release held experiments via SLURM.
reload-sources: Reload stashed source files.
reset: Reset the state of experiments by setting...
start: Fetch staged experiments from the database...
start-jupyter: Start a Jupyter slurm job.
status: Report status of experiments in the...
update-working-dir: Change the working directory of...
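The usage line allows several commands to be chained after a single COLLECTION argument. The following is a minimal sketch of how such a command line could be assembled from Python; the helper name `build_seml_command`, the collection name `my_collection`, and the file `config.yaml` are hypothetical and only serve as illustration. The sketch builds the argument list and does not run SEML.

```python
import shlex


def build_seml_command(collection, *commands, options=()):
    """Assemble: seml [OPTIONS] COLLECTION COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

    `commands` is a sequence of lists, each holding one command and its arguments.
    """
    argv = ["seml", *options, collection]
    for command in commands:
        argv.extend(command)
    return argv


# Hypothetical names: "my_collection" and "config.yaml" are placeholders.
argv = build_seml_command(
    "my_collection",
    ["add", "config.yaml"],  # first command with its argument
    ["start"],               # second command, chained after the first
)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a machine where `seml` is installed.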
Add experiments to the database as defined in the configuration.
Usage:
$ seml add [OPTIONS] CONFIG_FILES...
Arguments:
CONFIG_FILES...: Path to the YAML configuration file for the experiment. [required]
Options:
-nh, --no-hash: By default, we use the hash of the config dictionary to filter out duplicates (by comparing all dictionary values individually). Only disable this if you have a good reason as it is faster.
-ncs, --no-sanity-check: Disable this if the check fails unexpectedly when using advanced Sacred features or to accelerate adding.
-ncc, --no-code-checkpoint: Disable this if you want your experiments to use the current code instead of the code at the time of adding.
-f, --force: Force adding the experiment even if it already exists in the database.
-o, --overwrite-params DICT: Dictionary (passed as a string, e.g. '{"epochs": 100}') to overwrite parameters in the config.
-d, --description TEXT: A description for the experiment.
--no-resolve-descriptions: Whether to prevent using omegaconf to resolve experiment descriptions.
--help: Show this message and exit.
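Options of type DICT, such as --overwrite-params, take the dictionary as a single string. A sketch of one way to produce that string from Python is shown below, assuming that JSON output for simple values (numbers, strings) has the same form as the example in the help text; whether other literals are accepted is not stated here. The helper `add_command` and the names `my_collection` and `config.yaml` are hypothetical.

```python
import json
import shlex


def add_command(config_file, overwrite_params=None, description=None):
    """Build the argument list for the `add` command (illustrative helper)."""
    args = ["add", config_file]
    if overwrite_params is not None:
        # The dictionary is serialized into ONE string argument.
        args += ["--overwrite-params", json.dumps(overwrite_params)]
    if description is not None:
        args += ["--description", description]
    return args


# Hypothetical collection and config file names.
argv = ["seml", "my_collection", *add_command("config.yaml", {"epochs": 100}, "baseline run")]

# shlex.join quotes the dictionary string so that a shell treats it as one argument.
print(shlex.join(argv))
```

Serializing the dictionary programmatically avoids hand-written quoting mistakes when the command is later pasted into a shell.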
Cancel the Slurm job/job step corresponding to experiments, filtered by ID or state.
Usage:
$ seml cancel [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, RUNNING]
-w, --wait: Wait until all jobs are properly cancelled.
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
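The --filter-dict example uses a dotted key ("config.dataset") to address a nested field of the experiment document. The sketch below illustrates that idea with plain dictionaries. It is a simplified, equality-only stand-in and not how SEML or MongoDB evaluate filters; the helper names and the sample documents are made up for illustration.

```python
def get_by_path(document, dotted_key):
    """Follow a dotted key such as "config.dataset" into nested dicts.

    Returns None if any part of the path is missing.
    """
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(document, filter_dict):
    """True if every dotted key in filter_dict equals the value in the document."""
    return all(get_by_path(document, key) == value for key, value in filter_dict.items())


# Made-up experiment documents; only the fields used by the filter are shown.
experiments = [
    {"_id": 1, "config": {"dataset": "cora_ml"}},
    {"_id": 2, "config": {"dataset": "citeseer"}},
]

selected = [exp["_id"] for exp in experiments if matches(exp, {"config.dataset": "cora_ml"})]
print(selected)  # [1]
```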
Claim an experiment from the database.
Usage:
$ seml claim-experiment [OPTIONS] SACRED_IDS...
Arguments:
SACRED_IDS...: Sacred IDs (_id in the database collection) of the experiments to claim. [required]
Options:
--help: Show this message and exit.
Remove orphaned artifacts in the DB from runs which have been deleted.
Usage:
$ seml clean-db [OPTIONS]
Options:
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Cancel empty pending jobs.
Usage:
$ seml clean-jobs [OPTIONS] SACRED_IDS...
Arguments:
SACRED_IDS...: Sacred IDs (_id in the database collection) of the experiments to claim. [required]
Options:
--help: Show this message and exit.
Configure SEML (database, argument completion, ...).
Usage:
$ seml configure [OPTIONS]
Options:
--host TEXT: The host of the MongoDB server.
--port INTEGER: The port of the MongoDB server.
--database TEXT: The name of the MongoDB database to use.
--username TEXT: The username for the MongoDB server.
--password TEXT: The password for the MongoDB server.
-sf, --ssh-forward: Configure SSH forwarding settings for MongoDB.
--help: Show this message and exit.
Delete experiments by ID or state (cancels Slurm jobs first unless --no-cancel is given).
Usage:
$ seml delete [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED, FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-nc, --no-cancel: Do not cancel the experiments before deleting them.
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Manage descriptions of the experiments in a collection.
Usage:
$ seml description [OPTIONS] COMMAND [ARGS]...
Options:
--help: Show this message and exit.
Commands:
delete: Deletes the description of experiment(s).
list: Lists the descriptions of all experiments.
set: Sets the description of experiment(s).
Deletes the description of experiment(s).
Usage:
$ seml description delete [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Lists the descriptions of all experiments.
Usage:
$ seml description list [OPTIONS]
Options:
-u, --update-status: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary.
--help: Show this message and exit.
Sets the description of experiment(s).
Usage:
$ seml description set [OPTIONS] DESCRIPTION
Arguments:
DESCRIPTION: The description to set. [required]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes: Automatically confirm all dialogues with yes.
--no-resolve-descriptions: Whether to prevent using omegaconf to resolve experiment descriptions.
--help: Show this message and exit.
Prints duplicate experiment configurations.
Usage:
$ seml detect-duplicates [OPTIONS]
Options:
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED, FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.
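To illustrate what "duplicate experiment configurations" means, here is a minimal sketch that groups experiments by a hash of their configuration dictionary. It is one possible approach under stated assumptions, not SEML's actual hashing scheme: the canonical-JSON-plus-SHA-256 hash, the helper names, and the sample documents are all chosen for illustration.

```python
import hashlib
import json
from collections import defaultdict


def config_hash(config):
    """Hash a config dict independently of key order (assumed scheme, not SEML's)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(experiments):
    """Return lists of _id values whose configurations hash to the same value."""
    groups = defaultdict(list)
    for exp in experiments:
        groups[config_hash(exp["config"])].append(exp["_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up experiment documents.
experiments = [
    {"_id": 1, "config": {"dataset": "cora_ml", "lr": 0.01}},
    {"_id": 2, "config": {"lr": 0.01, "dataset": "cora_ml"}},  # same config, different key order
    {"_id": 3, "config": {"dataset": "cora_ml", "lr": 0.1}},
]

print(find_duplicates(experiments))  # [[1, 2]]
```

Sorting the keys before hashing makes two configurations with the same content but different key order count as duplicates.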
Detect experiments where the corresponding Slurm jobs were killed externally.
Usage:
$ seml detect-killed [OPTIONS]
Options:
--help: Show this message and exit.
Download source files from the database to the provided path.
Usage:
$ seml download-sources [OPTIONS] TARGET_DIRECTORY
Arguments:
TARGET_DIRECTORY: The directory where the source files should be restored. [required]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.
Drop collections from the database.
Note: This is a dangerous operation and should only be used if you know what you are doing.
Usage:
$ seml drop [OPTIONS] [PATTERN]
Arguments:
[PATTERN]: A regex that must match the collections to print. [default: .*]
Options:
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Hold queued experiments via SLURM.
Usage:
$ seml hold [OPTIONS]
Options:
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.
Launch a local worker that runs PENDING jobs.
Usage:
$ seml launch-worker [OPTIONS]
Options:
-n, --num-experiments INTEGER: Number of experiments to start. 0: all (staged) experiments [default: 0]
-nf, --no-file-output: Do not write the experiment's output to a file.
-ss, --steal-slurm: Local jobs 'steal' from the Slurm queue, i.e. also execute experiments waiting for execution via Slurm.
-pm, --post-mortem: Activate post-mortem debugging with pdb.
-d, --debug: Run a single interactive experiment without Sacred observers and with post-mortem debugging. Implies --verbose --num-exps 1 --post-mortem --output-to-console.
-ds, --debug-server: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
-o, --output-to-console: Write the experiment's output to the console.
-wg, --worker-gpus TEXT: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT: Further environment variables to be set for the local worker.
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.
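The worker options map onto environment variables: --worker-gpus goes to CUDA_VISIBLE_DEVICES, --worker-cpus to OMP_NUM_THREADS, and --worker-env adds further variables. The sketch below shows how such an environment could be assembled for a subprocess. It is an illustration of the described mapping, not SEML's implementation; the function name `worker_environment`, the variable `MY_FLAG`, and the choice to apply the extra variables last are assumptions.

```python
import os


def worker_environment(worker_gpus=None, worker_cpus=None, worker_env=None, base_env=None):
    """Build an environment dict for a local worker process (illustrative only)."""
    env = dict(os.environ if base_env is None else base_env)
    if worker_gpus is not None:
        # GPU IDs are passed through unchanged, e.g. "0,1".
        env["CUDA_VISIBLE_DEVICES"] = worker_gpus
    if worker_cpus is not None:
        # Environment values must be strings.
        env["OMP_NUM_THREADS"] = str(worker_cpus)
    if worker_env:
        # Assumption: further variables are applied last and may override the above.
        env.update({key: str(value) for key, value in worker_env.items()})
    return env


# MY_FLAG is a hypothetical variable; base_env={} keeps the example deterministic.
env = worker_environment("0,1", 4, {"MY_FLAG": "1"}, base_env={})
print(env)
```

A dictionary built this way can be handed to `subprocess.run(..., env=env)` to start a process with those settings.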
Lists all collections in the database.
Usage:
$ seml list [OPTIONS] [PATTERN]
Arguments:
[PATTERN]: A regex that must match the collections to print. [default: .*]
Options:
-p, --progress: Whether to print a progress bar for iterating over collections.
-u, --update-status: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary.
-fd, --full-descriptions: Whether to print full descriptions (possibly with line breaks).
--help: Show this message and exit.
Fetch experiment from database, prepare it and print the command to execute it.
Usage:
$ seml prepare-experiment [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters. [required]
-v, --verbose: Whether to print debug messages.
-u, --unobserved: Run the experiments without Sacred observers.
-pm, --post-mortem: Activate post-mortem debugging with pdb.
-ssd, --stored-sources-dir TEXT: Load source files into this directory before starting.
-ds, --debug-server: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
--help: Show this message and exit.
Print the commands that would be executed by start.
Usage:
$ seml print-command [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-n, --num-experiments INTEGER: Number of experiments to start. 0: all (staged) experiments [default: 0]
-wg, --worker-gpus TEXT: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT: Further environment variables to be set for the local worker.
--unresolved: Whether to print the unresolved command.
--no-interpolation: Whether to disable variable interpolation. Only compatible with --unresolved.
--help: Show this message and exit.
Print the experiment document.
Usage:
$ seml print-experiment [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, STAGED, QUEUED, RUNNING, FAILED, KILLED, INTERRUPTED, COMPLETED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-p, --projection KEY: List of configuration keys, e.g., config.model, to additionally print.
-F, --format TEXT: The format in which to print the experiment document. [default: yaml]
--help: Show this message and exit.
Prints fail traces of all failed experiments.
Usage:
$ seml print-fail-trace [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: FAILED, KILLED, INTERRUPTED]
-p, --projection KEY: List of configuration keys, e.g., config.model, to additionally print.
--help: Show this message and exit.
Print the output of experiments.
Usage:
$ seml print-output [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: RUNNING, FAILED, KILLED, INTERRUPTED, COMPLETED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-sl, --slurm: Whether to print the Slurm output instead of the experiment output.
-h, --head INTEGER: Print the first n lines of the output.
-t, --tail INTEGER: Print the last n lines of the output.
--help: Show this message and exit.
Setting up new projects.
Usage:
$ seml project [OPTIONS] COMMAND [ARGS]...
Options:
--help: Show this message and exit.
Commands:
init: Initialize a new project in the given...
list-templates: List available project templates.
Initialize a new project in the given directory.
Usage:
$ seml project init [OPTIONS] [DIRECTORY]
Arguments:
[DIRECTORY]: The directory in which to initialize the project. [default: .]
Options:
-t, --template TEXT: The template to use for the project. To view available templates use seml project list-templates. [default: default]
-n, --name TEXT: The name of the project. (By default inferred from the directory name.)
-u, --username TEXT: The author name to use for the project. (By default inferred from $USER)
-m, --usermail TEXT: The author email to use for the project. (By default empty.)
-r, --git-remote TEXT: The git remote to use for the project. (By default SETTINGS.TEMPLATE_REMOTE.)
-c, --git-commit TEXT: The exact git commit to use. May also be a tag or branch (By default latest)
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
List available project templates.
Usage:
$ seml project list-templates [OPTIONS]
Options:
-r, --git-remote TEXT: The git remote to use for the project. (By default SETTINGS.TEMPLATE_REMOTE.)
-c, --git-commit TEXT: The exact git commit to use. May also be a tag or branch (By default latest)
--help: Show this message and exit.
Prints the collections of the given job IDs. If none is specified, all jobs are considered.
Usage:
$ seml queue [OPTIONS] [JOB_IDS]...
Arguments:
[JOB_IDS]...: The job IDs of the experiments to get the collection for.
Options:
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, RUNNING]
-a, --all: Whether to attempt finding the collection of the jobs of all users.
-w, --watch: Whether to watch the queue.
--help: Show this message and exit.
Release held experiments via SLURM.
Usage:
$ seml release [OPTIONS]
Options:
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.
Reload stashed source files.
Usage:
$ seml reload-sources [OPTIONS]
Options:
-k, -keep-old: Keep the old source files in the database.
-b, --batch-ids INTEGER: Batch IDs (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Reset the state of experiments by setting their state to STAGED and cleaning their database entry. Does not cancel Slurm jobs.
Usage:
$ seml reset [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]: List of states to filter the experiments by. If empty (""), all states are considered. [default: FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes: Automatically confirm all dialogues with yes.
--help: Show this message and exit.
Fetch staged experiments from the database and run them (by default via Slurm).
Usage:
$ seml start [OPTIONS]
Options:
-id, --sacred-id INTEGER: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-d, --debug: Run a single interactive experiment without Sacred observers and with post-mortem debugging. Implies --verbose --num-exps 1 --post-mortem --output-to-console.
-ds, --debug-server: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
-l, --local: Run the experiment locally instead of on a Slurm cluster.
-nw, --no-worker: Do not launch a local worker after setting experiments' state to PENDING.
-n, --num-experiments INTEGER: Number of experiments to start. 0: all (staged) experiments [default: 0]
-nf, --no-file-output: Do not write the experiment's output to a file.
-ss, --steal-slurm: Local jobs 'steal' from the Slurm queue, i.e. also execute experiments waiting for execution via Slurm.
-pm, --post-mortem: Activate post-mortem debugging with pdb.
-o, --output-to-console: Write the experiment's output to the console.
-wg, --worker-gpus TEXT: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT: Further environment variables to be set for the local worker.
--help: Show this message and exit.
Start a Jupyter Slurm job. Uses SBATCH options defined in settings.py under SBATCH_OPTIONS_TEMPLATES.JUPYTER.
Usage:
$ seml start-jupyter [OPTIONS]
Options:
-l, --lab: Start a jupyter-lab instance instead of jupyter notebook.
-c, --conda-env TEXT: Start the Jupyter instance in a Conda environment.
-sb, --sbatch-options DICT: Dictionary of SBATCH options, passed as a string; e.g. '{"gres": "gpu:2"}' requests two GPUs.
--help: Show this message and exit.
Report status of experiments in the database collection.
Usage:
$ seml status [OPTIONS]
Options:
-u, --update-status: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary. [default: True]
-p, --projection KEY: List of configuration keys, e.g., config.model, to additionally print.
--help: Show this message and exit.
Change the working directory of experiments in case you moved the source code to a different location.
Usage:
$ seml update-working-dir [OPTIONS] WORKING_DIR
Arguments:
WORKING_DIR: The new working directory for the experiments. [required]
Options:
-b, --batch-ids INTEGER: Batch IDs (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help: Show this message and exit.