A python package with container support for Toil pipelines.
Check the example below! This package was built to support the 🍪 cookiecutter-toil repository. It has been tested against the following Singularity versions:
- Singularity 2.4.2
- Singularity 2.6.1
📦 Easy Installation
```bash
pip install toil_container
```
🐳 Container System Calls
`docker_call` and `singularity_call` are functions that run containerized commands with the same calling signature. By default the exit code is returned; you can get the stdout with `check_output=True`. You can also set `env`, `cwd`, `volumes`, and `working_dir` for the container call. `working_dir` is used as the `/tmp` directory inside the container.

```python
from toil_container import docker_call
from toil_container import singularity_call

cmd = ["cowsay", "hello world"]

status = docker_call("docker/whalesay", cmd)
output = docker_call("docker/whalesay", cmd, check_output=True)

status = singularity_call("docker://docker/whalesay", cmd)
output = singularity_call("docker://docker/whalesay", cmd, check_output=True)
```
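To illustrate what `check_output` toggles, here is a minimal local analogue of that calling convention written with plain `subprocess` — a sketch for clarity, not part of the package:

```python
import subprocess


def local_call(cmd, check_output=False, env=None, cwd=None):
    """Illustrative analogue of docker_call/singularity_call: return the
    exit code by default, or the decoded stdout when check_output=True."""
    if check_output:
        return subprocess.check_output(cmd, env=env, cwd=cwd).decode()
    return subprocess.call(cmd, env=env, cwd=cwd)


status = local_call(["true"])  # exit code, 0 on success
output = local_call(["echo", "hello"], check_output=True)
```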
🛳 Container Job Class
`ContainerJob` is a Toil Job class with a `call` method that executes commands with either Docker, Singularity, or Subprocess, depending on image availability. Check out the simple whalesay example below! The job must be constructed with an `options` argument of type `argparse.Namespace` that may have the following attributes:

| attribute | action | description |
| --- | --- | --- |
| `options.docker` | use docker | name or path to image |
| `options.singularity` | use singularity | name or path to image |
| `options.workDir` | set as container `/tmp` | path to work directory |
| `options.volumes` | volumes to be mounted | list of (src, dst) tuples |
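For illustration, a minimal `options` namespace carrying these attributes could be sketched as below. In a real pipeline the namespace comes from `ContainerArgumentParser.parse_args()`, which also holds the Toil options; the values here are hypothetical:

```python
from argparse import Namespace

# Hypothetical namespace with only the attributes listed above; leave
# docker and singularity as None to fall back to a subprocess call.
options = Namespace(
    docker="docker/whalesay",           # name or path of a docker image
    singularity=None,                   # or a singularity image path
    workDir="/tmp",                     # used as /tmp inside the container
    volumes=[("/shared", "/shared")],   # list of (src, dst) tuples
)
```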
🔌 Extended LSF functionality
Running with `--batchSystem custom_lsf` provides two features:

- Allows passing a per-job `runtime (int)` to LSF using `-W`.
- Automatically retries a job killed by `TERM_RUNLIMIT`, doubling the initial runtime.

Additionally, it provides an optimization that caches the status of running jobs by calling `bjobs` once for all current jobs, instead of once per job.

NOTE: The original `toil.Job` class doesn't provide an option to set `runtime` per job. You could only set a wall runtime globally by adding `-W <runtime>` in `TOIL_LSF_ARGS` (see BD2KGenomics/toil#2065). Please note that our hack encodes the `runtime` requirement in the job's `unitName`, so your log files will have a longer name. Let us know if you need more custom parameters or if you know of a better solution 😄.

Configure `custom_lsf` with the following environment variables:

| variable | description |
| --- | --- |
| `TOIL_CONTAINER_RUNTIME` | set a default runtime in minutes |
| `TOIL_CONTAINER_RETRY_MEM` | retry memory in integer GB (default "60") |
| `TOIL_CONTAINER_RETRY_RUNTIME` | retry runtime in integer minutes (default "40000") |
| `TOIL_CONTAINER_RUNTIME_FLAG` | bsub runtime flag (default "-W") |
| `TOIL_CONTAINER_LSF_PER_CORE` | "Y" if LSF resources are per core, and not per job |
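As a sketch of how these environment variables and the retry behavior described above fit together, the snippet below reads the documented defaults and doubles the runtime on a retry. The parsing and capping logic is illustrative only, not the package's actual code:

```python
import os

# Defaults taken from the descriptions above.
retry_mem = int(os.environ.get("TOIL_CONTAINER_RETRY_MEM", "60"))             # GB
retry_runtime = int(os.environ.get("TOIL_CONTAINER_RETRY_RUNTIME", "40000"))  # minutes
runtime_flag = os.environ.get("TOIL_CONTAINER_RUNTIME_FLAG", "-W")
per_core = os.environ.get("TOIL_CONTAINER_LSF_PER_CORE", "") == "Y"


def next_runtime(runtime):
    """On a TERM_RUNLIMIT kill, retry with double the runtime
    (illustrative; capped here at the retry runtime)."""
    return min(runtime * 2, retry_runtime)
```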
📘 Container Parser With Short Toil Options
`ContainerArgumentParser` adds the `--docker`, `--singularity`, and `--volumes` arguments to the options namespace. This parser only prints the required Toil arguments when using `--help`; the full list of toil rocketry is printed with `--help-toil`. If you don't need the container options but want to use `--help-toil`, use `ToilShortArgumentParser`.

```
whalesay.py --help-container

usage: whalesay [-h] [-v] [--help-toil] [TOIL OPTIONAL ARGS] jobStore

optional arguments:
  -h, --help          show this help message and exit
  --help-toil         show help with toil arguments and exit
  --help-container    show help with container arguments and exit

container arguments:
  --docker            name/path of the docker image available in daemon
  --singularity       name/path of the singularity image available in daemon
  --volumes           tuples of (local path, absolute container path)

toil arguments:
  TOIL OPTIONAL ARGS  see --help-toil for a full list of toil parameters
  jobStore            the location of the job store for the workflow [REQUIRED]
```
whalesay.py is an example that runs a toil pipeline with the famous whalesay docker container. The pipeline can now be executed with either docker, singularity or subprocess.
```python
# whalesay.py
from toil_container import ContainerJob
from toil_container import ContainerArgumentParser


class WhaleSayJob(ContainerJob):

    def run(self, fileStore):
        """Run `cowsay` with Docker, Singularity or Subprocess."""
        msg = self.call(["cowsay", self.options.msg], check_output=True)
        fileStore.logToMaster(msg)


def main():
    parser = ContainerArgumentParser()
    parser.add_argument("-m", "--msg", default="Hello from the ocean!")
    options = parser.parse_args()
    job = WhaleSayJob(options=options)
    ContainerJob.Runner.startToil(job, options)


if __name__ == "__main__":
    main()
```

Then run:
```bash
# run with docker
whalesay.py jobstore -m 'hello world' --docker docker/whalesay

# run with singularity
whalesay.py jobstore -m 'hello world' --singularity docker://docker/whalesay

# run with subprocess, if cowsay is available in the environment
whalesay.py jobstore -m 'hello world'
```

If you want to convert a docker image into a singularity image instead of using the `docker://` prefix, check docker2singularity, and use `-m '/shared-fs-path /shared-fs-path'` to make sure your shared file system is mounted inside the singularity image.
Contributions are welcome and greatly appreciated! Check our contributing guidelines, and make sure you add your name to the contributors list:
- 🐋 Juan S. Medina @jsmedmar
- 🐴 Juan E. Arango @juanesarango
- 🐒 Max F. Levine @mflevine
- 🐼 Joe Zhou @zhouyangyu
- This repo was inspired by toil's implementation of a `Docker Call` and toil_vg's interface for `Singularity Calls`.
- This package was initiated with Cookiecutter and the audreyr/cookiecutter-pypackage project template.