This project provides multiple customizable rules for the workflow manager Snakemake. These rules are built as Python functions and can be imported (with the Snakemake include directive) and parameterized (with the arguments of the function).
This section shows how to add the markDuplicates rule to your workflow.
- Copy the rule into the rules folder of your workflow:

      workflow_folder/
          rules/
              markDuplicates.smk
          Snakemake

- Add the rule to the list of imports in rules/all.smk:

  - Add all.smk to rules (once per workflow):

        workflow_folder/
            rules/
                all.smk
                markDuplicates.smk
            Snakemake

  - Add the rule to all.smk (it contains all the rules to import):

        ...
        include: "markDuplicates.smk"
        ...

- Import all the wrapper functions in your Snakefile:

      ...
      include: "rules/all.smk"
      ...

The function markDuplicates() is now accessible in your workflow.
Add the function call and its parameters to your code:

    ...
    markDuplicates(
        params_stringency="STRICT",
        params_keep_outputs=True
    )
    ...

All accessible parameters and their default values are listed in the function
declaration of the rule in markDuplicates.smk. The main categories of these
parameters are:
* input_,
* output_,
* params_,
* snake_: for parameters related to Snakemake elements, such as wildcard
restrictions.
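
To see which parameters a rule function exposes without reading the whole file, you can inspect its signature. The sketch below uses a stand-in function: only params_stringency and params_keep_outputs come from the example above, while the other parameter names and all default values are hypothetical. The real declaration is the one in markDuplicates.smk.

```python
import inspect
from collections import defaultdict


def markDuplicates(
    input_bam="aln/{sample}.bam",        # hypothetical parameter and default
    output_bam="dedup/{sample}.bam",     # hypothetical parameter and default
    params_stringency="LENIENT",         # default value is hypothetical
    params_keep_outputs=False,           # default value is hypothetical
    snake_wildcard_constraints=None,     # hypothetical parameter
):
    """Stand-in that only mimics the prefix convention of the real function."""


def group_parameters(func):
    """Group keyword parameters by the text before their first underscore."""
    groups = defaultdict(dict)
    for name, param in inspect.signature(func).parameters.items():
        prefix = name.split("_", 1)[0]
        groups[prefix][name] = param.default
    return dict(groups)


groups = group_parameters(markDuplicates)
for prefix, params in groups.items():
    print(prefix, params)
```

Each printed line shows one category together with its parameters and their defaults.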
Keep in mind that inputs and outputs must be consistent in terms of wildcards,
as with a standard rule.
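
A quick way to check this before running the workflow is to compare the wildcard names used in your input and output patterns. The following is a minimal sketch, not part of this project: the helper names and file patterns are hypothetical, and the regular expression only handles the simple forms {name} and {name,constraint} without nested braces.

```python
import re

# Matches "{name}" or "{name,constraint}" (constraint without nested braces).
WILDCARD = re.compile(r"\{\s*(\w+)\s*(?:,[^{}]*)?\}")


def wildcard_names(pattern):
    """Return the set of wildcard names found in one file pattern."""
    return {match.group(1) for match in WILDCARD.finditer(pattern)}


def undetermined_wildcards(inputs, outputs):
    """Return wildcards used in inputs that no output pattern defines."""
    in_names = set().union(*(wildcard_names(p) for p in inputs))
    out_names = set().union(*(wildcard_names(p) for p in outputs))
    return in_names - out_names


ok = undetermined_wildcards(["aln/{sample}.bam"], ["dedup/{sample}.bam"])
bad = undetermined_wildcards(["aln/{sample}_{lane}.bam"], ["dedup/{sample}.bam"])
print(ok)   # empty set: every input wildcard is defined by the output
print(bad)  # contains "lane": the output pattern does not define it
```

An empty result means every wildcard used in the inputs can be resolved from the outputs.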
You can provide the software used by the rule in one of these three ways:
- The software folder is in $PATH.

- The path of the software is in the workflow configuration file:

      ...
      software_paths:
          picard: /home/user/bin/picard
      ...

  The name of the parameter is in the bin_path argument of the params
  section of the rule in markDuplicates.smk.

- You use a conda environment with Snakemake (see option --use-conda) and
  the environment of your rule is in the envs folder:

      workflow_folder/
          envs/
              picard.yml
          rules/
              all.smk
              markDuplicates.smk
          Snakemake

  The name of the environment file can be found in the conda section of the
  rule in markDuplicates.smk.

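
The first two options can be pictured as a simple lookup. The sketch below is an illustration only, not the code used by the rules: the function name resolve_software and the order of precedence (configuration first, then $PATH) are assumptions, and the conda option is not covered because Snakemake handles it itself.

```python
import shutil


def resolve_software(name, config):
    """Return the configured path for a tool, else search for it in $PATH.

    Assumed precedence: software_paths in the configuration, then $PATH.
    Returns None when the tool is found in neither place.
    """
    configured = config.get("software_paths", {}).get(name)
    if configured:
        return configured
    return shutil.which(name)


config = {"software_paths": {"picard": "/home/user/bin/picard"}}
print(resolve_software("picard", config))  # path taken from the configuration
print(resolve_software("picard", {}))      # result of the $PATH lookup, or None
```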
Optional for local execution and required for cluster submission.
Default values are stored in the resources section of the rule:

    resources:
        extra = "", # options added to cluster submission (example: "--qos=project_A")
        mem = "8G", # maximum memory required
        partition = "normal" # partition/queue name for job
    threads: 1 # threads per job

Declare how these parameters are used in the snakemake command line:

    snakemake \
    ...
    --cluster 'sbatch {resources.extra} --mem={resources.mem} --partition={resources.partition} --cpus-per-task={threads}'

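
To get a feel for what the submission command looks like for a job, you can expand the same template with plain Python string formatting. This is only an approximation of the substitution, not Snakemake's own code; it uses the default values from the resources section above.

```python
from types import SimpleNamespace

CLUSTER_TEMPLATE = (
    "sbatch {resources.extra} --mem={resources.mem} "
    "--partition={resources.partition} --cpus-per-task={threads}"
)

# Default values taken from the resources and threads sections of the rule.
resources = SimpleNamespace(extra="", mem="8G", partition="normal")
threads = 1

command = CLUSTER_TEMPLATE.format(resources=resources, threads=threads)
print(command)
```

Because extra is empty by default, the expanded command simply contains an additional space after sbatch.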
Values can be changed with --set-threads and --set-resources on the snakemake
command line:

    snakemake \
    ...
    --cluster 'sbatch {resources.extra} --mem={resources.mem} --partition={resources.partition} --cpus-per-task={threads}' \
    --set-threads $RULE=$NUM \
    --set-resources $RULE:$KEY=$VAL $RULE:$KEY=$VAL

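
The effect of the overrides can be sketched as a merge of $RULE:$KEY=$VAL entries over the defaults of the rule. The helper below is hypothetical and does not reproduce Snakemake's parsing; the values 16G and long are made up for the example.

```python
def parse_set_resources(args):
    """Parse 'RULE:KEY=VAL' strings into a {rule: {key: value}} mapping."""
    overrides = {}
    for arg in args:
        target, value = arg.split("=", 1)
        rule, key = target.split(":", 1)
        overrides.setdefault(rule, {})[key] = value
    return overrides


# Defaults from the resources section of the rule.
defaults = {"extra": "", "mem": "8G", "partition": "normal"}

overrides = parse_set_resources(
    ["markDuplicates:mem=16G", "markDuplicates:partition=long"]
)
# Overrides for the rule win over its defaults; untouched keys are kept.
effective = {**defaults, **overrides.get("markDuplicates", {})}
print(effective)
```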
You can also change the default values in the profile configuration file
$PROFILE/config.yaml:

    cluster: "sbatch {resources.extra} --mem={resources.mem} --partition={resources.partition} --cpus-per-task={threads}"
    set-resources:
      - $RULE:$KEY=$VAL

2019 Laboratoire d'Anatomo-Cytopathologie du CHU Toulouse