MilanParikh/infercnv_workflow

infercnv_workflow

Terra workflow for inferCNV with cell caching

Automation Script Jupyter Notebook

This notebook/script helps you set up the input files and bash scripts needed to run the R version of inferCNV on Terra from a Python (Scanpy) environment.

Comments in the notebook show where to configure paths, Google buckets, Terra information, your email, etc.

There are two options:

  1. Per Sample (Recommended) - This splits each of your samples into its own parallel Terra run, which takes less time, requires fewer resources, and costs much less.
  2. All Samples - This runs your whole count matrix at one time. It can take days, requires very high resources for larger matrices, and can cost quite a bit.
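
As an illustration of the Per Sample option, the sketch below groups cells by sample and writes one annotation file per sample (tab-delimited, no header, cell name then cell type). It is not the notebook's actual code: the function names, the `_annotations.txt` suffix, and the input dictionaries are assumptions made for this example.

```python
import csv
from collections import defaultdict
from pathlib import Path


def group_cells_by_sample(cell_to_sample):
    """Map each sample name to the sorted list of its cell barcodes."""
    groups = defaultdict(list)
    for cell, sample in cell_to_sample.items():
        groups[sample].append(cell)
    return {sample: sorted(cells) for sample, cells in groups.items()}


def write_per_sample_annotations(cell_to_sample, cell_to_type, out_dir):
    """Write one tab-delimited, header-less annotation file per sample.

    The file naming scheme is hypothetical; adapt it to your own bucket layout.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for sample, cells in group_cells_by_sample(cell_to_sample).items():
        path = out_dir / f"{sample}_annotations.txt"
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            for cell in cells:
                writer.writerow([cell, cell_to_type[cell]])
        paths[sample] = path
    return paths
```

In a Scanpy session the two dictionaries could come from columns of `adata.obs`, for example `adata.obs["sample"].to_dict()`, assuming your object stores the sample label under that column name.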

Docker

Dockerfile - contains the R and Python setup required for version 1.8.1 of inferCNV, with a modified inferCNV.R script that enables the up_to_step flag

(Already built here: https://hub.docker.com/repository/docker/mparikhbroad/infercnv_caching and referenced in the WDL)

WDL

The workflow is split into 6 roughly evenly timed stages using the "up_to_step" flag in order to enable cell caching. This way a long-running inferCNV process does not lose too much progress if it is preempted.

(Already uploaded here: https://portal.firecloud.org/?return=terra#methods/mparikh/infercnv_resumable/3)

At the end of each step the outputs are compressed into a .tar.gz archive, and the following step untars the archive and resumes from it.
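
The archive-and-resume hand-off can be sketched in Python with the standard `tarfile` module. The WDL itself presumably does this with shell commands inside each task, so treat the function names and directory layout here as assumptions for illustration only.

```python
import tarfile
from pathlib import Path


def archive_outputs(output_dir, archive_path):
    """Compress a step's output directory into a .tar.gz archive."""
    output_dir = Path(output_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the full path.
        tar.add(output_dir, arcname=output_dir.name)
    return Path(archive_path)


def restore_outputs(archive_path, work_dir):
    """Extract a previous step's archive so the next step can resume from it."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(work_dir)
    return work_dir
```

Because the archive keeps the output directory name, the next step finds its intermediate files at the same relative path after extraction.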

infercnv_resumable.wdl is the newest and recommended version. It can resume from an intermediate step using the .tar.gz archive saved by a previous step.
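
The resume logic amounts to skipping every stage whose cut-off step has already been reached. The sketch below shows that idea only: the cut-off values are hypothetical placeholders, not the six values the WDL actually uses, and `stages_to_run` is a name invented for this example.

```python
# Hypothetical up_to_step cut-offs, one per stage; the real WDL defines its own.
STAGE_UP_TO_STEP = [3, 7, 11, 15, 18, 22]


def stages_to_run(stage_cutoffs, last_completed_step=0):
    """Return the up_to_step values that still need to run.

    last_completed_step is the step reached by the archive being resumed from,
    or 0 when starting from scratch.
    """
    return [step for step in stage_cutoffs if step > last_completed_step]
```

For example, resuming from an archive produced at step 11 would leave only the stages ending at steps 15, 18, and 22 under these placeholder values.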
