PhyloTree_SM

RNA-seq SNP-based phylogenetic tree pipeline, implemented as a Snakemake workflow

Links

Original Pipeline Implementation using Shell

Usage

1. Install Snakemake

Using either the Mamba (recommended) or Conda package manager, install Snakemake & Snakedeploy in an isolated environment:

mamba create -c conda-forge -c bioconda -n snakemake snakemake snakedeploy

Ensure that the newly created environment is activated for all following steps:

mamba activate snakemake

2. Deploy workflow

Create a working directory for this project and enter it for all following steps:

mkdir -p path/to/workdir
cd path/to/workdir

If you want to run the pipeline according to the main branch of this repository, run:

snakedeploy deploy-workflow https://github.com/liaoyjruby/PhyloTree_SM . --branch main

If you want to have all files locally, clone this repository into the working directory:

git clone https://github.com/liaoyjruby/PhyloTree_SM.git .

There are two main folders:

workflow: contains the Snakemake rules that implement the workflow
config: contains configuration files that should be edited according to needs

3. Configure workflow

General Settings:

Modify config.yaml as needed according to comments in the file.

Units & Samples Sheets:

units.tsv: Required columns Sample_ID and ID. Add column Mapped_Path with absolute paths if aligned BAM files are elsewhere.
samples.tsv: Sample annotation sheet with required column Sample_ID. Add columns with information about conditions of interest as desired.

The pipeline will include all samples listed in the units.tsv sheet in the final phylogenetic tree output.
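As a rough illustration of the sheet layout described above, the following sketch writes two minimal tab-separated files into a scratch directory so that no real configuration is overwritten. The directory name example_sheets, the sample names S1 and S2, the ID values, and the Condition column are all placeholders assumed for this example; check the comments in config.yaml for where the pipeline expects the sheets and what the ID column should hold.

```shell
# Scratch directory (hypothetical name) so existing config files are left alone.
mkdir -p example_sheets

# units.tsv: required columns Sample_ID and ID (values here are placeholders).
printf 'Sample_ID\tID\n' > example_sheets/units.tsv
printf 'S1\tID_1\n' >> example_sheets/units.tsv
printf 'S2\tID_2\n' >> example_sheets/units.tsv

# samples.tsv: required column Sample_ID plus an optional, hypothetical Condition column.
printf 'Sample_ID\tCondition\n' > example_sheets/samples.tsv
printf 'S1\tcontrol\n' >> example_sheets/samples.tsv
printf 'S2\ttreated\n' >> example_sheets/samples.tsv

# Print the header of each sheet to confirm the column names.
head -n 1 example_sheets/units.tsv example_sheets/samples.tsv
```

Once the columns look right, the same layout can be applied to the real sheets used by the workflow.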

If you already have aligned BAM files, place them into subdirectory mapped/ with name <Sample_ID>.bam. If you do not want to copy them into the working directory, list their absolute paths in the Mapped_Path column of units.tsv instead.

If you already have the VCF file and its index for a sample, place them into subdirectory hcVCF/ with names <Sample_ID>.vcf.gz and <Sample_ID>.vcf.gz.tbi.
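Before running the workflow, it can help to see which pre-existing inputs are actually in place. The sketch below is not part of the pipeline: check_inputs is a hypothetical helper that reads the Sample_ID column from a units sheet and reports whether the files named above exist under mapped/ and hcVCF/. It does not look at the Mapped_Path column, so BAM files kept elsewhere will show as missing.

```shell
# check_inputs UNITS_TSV: for each Sample_ID in the sheet, report whether
# mapped/<Sample_ID>.bam and hcVCF/<Sample_ID>.vcf.gz (+ .tbi) are present.
check_inputs() {
    units="$1"
    # Find the Sample_ID column from the header, then print that field for every data row.
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Sample_ID") c = i; next }
                c && $c != "" { print $c }' "$units" |
    while IFS= read -r sample; do
        if [ -f "mapped/$sample.bam" ]; then bam=found; else bam=missing; fi
        # The VCF only counts as found when its index is present as well.
        if [ -f "hcVCF/$sample.vcf.gz" ] && [ -f "hcVCF/$sample.vcf.gz.tbi" ]; then
            vcf=found
        else
            vcf=missing
        fi
        printf '%s\tbam=%s\tvcf=%s\n' "$sample" "$bam" "$vcf"
    done
}
```

Run it from the working directory with the path to your units sheet, for example check_inputs config/units.tsv (the config/ location is an assumption here).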

4. Run workflow

To view the DAG of pipeline jobs, run:

snakemake -c 1 dag --use-conda

After configuration, run the Snakemake workflow while deploying any necessary software in the process with:

snakemake -c all --use-conda

The main script Snakefile in the workflow subfolder will automatically be detected and executed.

To limit the number of cores the pipeline uses, replace all with the desired number.
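One way to try a fixed core count safely is a dry run, which lists the jobs Snakemake would execute without running them. The sketch below assumes a core count of 4 purely as an example and skips the dry run when Snakemake or the workflow's Snakefile is not found in the current directory.

```shell
# Hypothetical core count; pick a value that suits your machine.
CORES=4

# Only attempt the dry run when Snakemake is installed and the workflow is deployed here.
if command -v snakemake >/dev/null 2>&1 && [ -f workflow/Snakefile ]; then
    # -n (--dry-run) shows the planned jobs without executing them.
    snakemake -n -c "$CORES" --use-conda
else
    echo "snakemake or workflow/Snakefile not found; skipping dry run"
fi
```

If the job list looks right, drop the -n flag to start the actual run with the same core count.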
