Description
We can manually configure a pipeline to use job profiling via Slurm. However, for newer users, a --profile option or a backend setting that forces job profiling could be helpful, especially if we automatically generate plots and, potentially, job efficiency metrics.
For the slurm_singularity backend we generate an sbatch script:
jetstream/jetstream/backends/slurm_singularity.py
Lines 675 to 680 in eca552c
sbatch_script += f"[[ -v SINGULARITY_CACHEDIR ]] || SINGULARITY_CACHEDIR=$HOME/.singularity/cache\n"
sbatch_script += f"if ls $SINGULARITY_CACHEDIR/oci-tmp | grep {singularity_image_digest} > /dev/null ; then\n"
sbatch_script += f"  {singularity_run_env_vars}{singularity_executable} exec {singularity_exec_args}{singularity_hostname_arg}{singularity_mounts_string} $SINGULARITY_CACHEDIR/oci-tmp/{singularity_image_digest} bash {cmd_script_filename}\n"
sbatch_script += f"else\n"
sbatch_script += f"  {singularity_run_env_vars}{singularity_executable} exec {singularity_exec_args}{singularity_hostname_arg}{singularity_mounts_string} {singularity_image} bash {cmd_script_filename}\n"
sbatch_script += f"fi\n"
Pseudo implementation:
if profile:
    # Ask Slurm to collect task-level profiling data, sampled every second
    sbatch_args.extend(['--profile=task', '--acctg-freq=task=1'])

if profile:
    # Queue a follow-up job that merges the per-node HDF5 profiles once this job ends, whether it succeeds or fails
    sbatch_script += f"sbatch -n1 -d afterany:$SLURM_JOB_ID --wrap=\"sh5util -j $SLURM_JOB_ID\"\n"
sbatch_script += f"[[ -v SINGULARITY_CACHEDIR ]] || SINGULARITY_CACHEDIR=$HOME/.singularity/cache\n"
sbatch_script += f"if ls $SINGULARITY_CACHEDIR/oci-tmp | grep {singularity_image_digest} > /dev/null ; then\n"
sbatch_script += f"  {singularity_run_env_vars}{singularity_executable} exec {singularity_exec_args}{singularity_hostname_arg}{singularity_mounts_string} $SINGULARITY_CACHEDIR/oci-tmp/{singularity_image_digest} bash {cmd_script_filename}\n"
sbatch_script += f"else\n"
sbatch_script += f"  {singularity_run_env_vars}{singularity_executable} exec {singularity_exec_args}{singularity_hostname_arg}{singularity_mounts_string} {singularity_image} bash {cmd_script_filename}\n"
sbatch_script += f"fi\n"

Deciding whether to generate a plot might be more complex than we expect. Initial thoughts suggest that we should generate the plot when the task completes. However, if a job fails due to memory or walltime, we would likely want the plot in that case as well.
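One way to frame the plot decision is as a check on the job's final Slurm state. The sketch below is illustrative only: `should_generate_plot` and `PLOT_STATES` are hypothetical names that do not exist in jetstream, and it assumes the full, untruncated state string is available (for example from `sacct --parsable2`).

```python
# Hypothetical helper, not part of jetstream.
# States for which a profiling plot is assumed to be useful:
# normal completion, plus failures caused by memory or walltime limits.
PLOT_STATES = {"COMPLETED", "OUT_OF_MEMORY", "TIMEOUT"}


def should_generate_plot(job_state: str) -> bool:
    """Return True if a profiling plot should be generated for this job state."""
    words = job_state.strip().split()
    if not words:
        return False
    # sacct may report e.g. "CANCELLED by 1000"; only the first word is the state.
    return words[0].upper() in PLOT_STATES
```

Keeping the states in a single set makes it easy to widen or narrow the policy later, for example to also plot on `FAILED`.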
We may also want to consider psrecord as a general option. It should only require inserting the following (assuming psrecord is already available; other alternatives may exist):
sbatch_script += f" psrecord {singularity_run_env_vars}{singularity_executable} exec ...
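As a rough sketch of how the psrecord wrapping could be factored out, the helper below builds the shell line from an arbitrary command string. `wrap_with_psrecord` and its parameters are hypothetical and not part of jetstream. The sketch assumes psrecord accepts the command as a single quoted argument and that the `--log`, `--plot`, `--interval`, and `--include-children` options are available in the installed version; `--plot` additionally needs matplotlib.

```python
import shlex


def wrap_with_psrecord(command: str, log_path: str, plot_path: str,
                       interval: float = 1.0) -> str:
    """Build a shell line that runs `command` under psrecord (hypothetical helper)."""
    return (
        # Quote the whole command so psrecord receives it as one argument.
        f"psrecord {shlex.quote(command)} "
        f"--log {shlex.quote(log_path)} "
        f"--plot {shlex.quote(plot_path)} "
        # Include child processes so the workload inside the container is counted.
        f"--interval {interval} --include-children"
    )
```

For example, `wrap_with_psrecord("singularity exec image.sif bash cmd.sh", "task.log", "task.png")` returns a single line that could be appended to `sbatch_script` in place of the bare `singularity exec` call; the image and file names here are placeholders.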