krakenuniq_workflow

Snakemake pipeline for metagenomic classification with KrakenUniq. It runs per-sample classification against a KrakenUniq database and produces compressed classification tables plus plain-text reports.

Repository layout

workflow/Snakefile - main Snakemake workflow
workflow/rules/krakenuniq.smk - KrakenUniq rules
dataset_example/config/config.yml - example dataset configuration
dataset_example/config/units.tsv - example dataset sample sheet
run_dataset.sh - run locally
run_dataset_slurm.sh - submit to SLURM

Requirements

snakemake (9+ recommended)
krakenuniq
Python with pandas

Configuration

Create a dataset directory with a config/ folder that contains:

config/config.yml
config/units.tsv

An example dataset lives at dataset_example/.

Example config/config.yml (pairs with the units.tsv example below):

prefix: demo
out_dir: results
units: config/units.tsv
krakenuniq:
  db: /path/to/krakenuniq/db
  threads: 8
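
A quick sanity check of the parsed config can catch a missing key before a run starts. The sketch below is illustrative only: check_config is a hypothetical helper, not part of this workflow, and treating every key in the example as required is an assumption. The config is written as the plain dict a YAML parser would return, so no extra package is needed.

```python
# Hypothetical helper, not part of the workflow. It assumes every key shown
# in the example config is required.
REQUIRED_KEYS = ("prefix", "out_dir", "units", "krakenuniq")
REQUIRED_KRAKENUNIQ_KEYS = ("db", "threads")


def check_config(config):
    """Return a list of problems found in a parsed config mapping."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing top-level key: {key}")

    # Treat a missing or malformed krakenuniq section as empty.
    kraken = config.get("krakenuniq")
    if not isinstance(kraken, dict):
        kraken = {}
    for key in REQUIRED_KRAKENUNIQ_KEYS:
        if key not in kraken:
            problems.append(f"missing krakenuniq key: {key}")

    threads = kraken.get("threads")
    if threads is not None and (not isinstance(threads, int) or threads < 1):
        problems.append("krakenuniq.threads must be a positive integer")
    return problems


# The example config above, as the dict a YAML parser would produce.
example = {
    "prefix": "demo",
    "out_dir": "results",
    "units": "config/units.tsv",
    "krakenuniq": {"db": "/path/to/krakenuniq/db", "threads": 8},
}
print(check_config(example))  # an empty list means no problems were found
```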

config/units.tsv columns:

unit_id: sample identifier, used as the name of the output sub-folder
unit_prefix: library/run identifier used in output filenames
fq: path to input FASTQ file

Example:

unit_id	unit_prefix	fq
SAMPLE01	SAMPLE01_L001	/data/fastq/SAMPLE01_L001.fq.gz
SAMPLE01	SAMPLE01_L002	/data/fastq/SAMPLE01_L002.fq.gz
SAMPLE02	SAMPLE02_L001	/data/fastq/SAMPLE02_L001.fq.gz
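
To show how the sample sheet groups libraries under a sample, here is a minimal sketch that parses the example with Python's standard csv module. read_units is a hypothetical helper and is not how the workflow itself loads the file. The duplicate check rests on an assumption: each unit_id/unit_prefix pair should be unique, because output filenames are derived from unit_prefix.

```python
import csv
import io
from collections import defaultdict

REQUIRED_COLUMNS = ("unit_id", "unit_prefix", "fq")


def read_units(handle):
    """Parse a tab-separated sample sheet into {unit_id: [(unit_prefix, fq), ...]}."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"units.tsv is missing columns: {', '.join(missing)}")

    units = defaultdict(list)
    seen = set()
    for row in reader:
        key = (row["unit_id"], row["unit_prefix"])
        # Assumption: a repeated pair would make two rows write the same outputs.
        if key in seen:
            raise ValueError(f"duplicate unit_id/unit_prefix pair: {key}")
        seen.add(key)
        units[row["unit_id"]].append((row["unit_prefix"], row["fq"]))
    return dict(units)


# The example sample sheet from above, inlined so the sketch is self-contained.
example_tsv = (
    "unit_id\tunit_prefix\tfq\n"
    "SAMPLE01\tSAMPLE01_L001\t/data/fastq/SAMPLE01_L001.fq.gz\n"
    "SAMPLE01\tSAMPLE01_L002\t/data/fastq/SAMPLE01_L002.fq.gz\n"
    "SAMPLE02\tSAMPLE02_L001\t/data/fastq/SAMPLE02_L001.fq.gz\n"
)
units = read_units(io.StringIO(example_tsv))
for unit_id, libraries in units.items():
    print(unit_id, [unit_prefix for unit_prefix, _ in libraries])
```

For a file on disk, pass an open handle instead, for example open("config/units.tsv", newline="").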

Run locally

From the repo root:

bash run_dataset.sh /path/to/dataset
bash run_dataset.sh dataset_example

To pass extra Snakemake arguments, add them after the dataset path, following a -- separator:

bash run_dataset.sh /path/to/dataset -- --cores 8 --rerun-incomplete

Run on SLURM

bash run_dataset_slurm.sh /path/to/dataset --jobs 50 --partition general
bash run_dataset_slurm.sh dataset_example --dry-run

See full options:

bash run_dataset_slurm.sh --help

Outputs

For each unit_id/unit_prefix pair, outputs are written under ${out_dir}/${unit_id}/:

classify/ - compressed KrakenUniq classification tables, one per unit_prefix
report/ - KrakenUniq report tables, one per unit_prefix, plus one aggregated report per unit_id
log/ - KrakenUniq logs, one per unit_prefix
stages/ - completion stamps created by the top-level target

Example output structure:

${out_dir}/
├── SAMPLE01/
│   ├── classify/
│   │   ├── SAMPLE01_L001.demo.krakenuniq_class.tsv.gz
│   │   └── SAMPLE01_L002.demo.krakenuniq_class.tsv.gz
│   ├── report/
│   │   ├── SAMPLE01_L001.demo.krakenuniq_report.tsv
│   │   ├── SAMPLE01_L002.demo.krakenuniq_report.tsv
│   │   └── SAMPLE01.demo.krakenuniq_report_aggregated.tsv
│   ├── log/
│   │   ├── SAMPLE01_L001.demo.classify.log
│   │   └── SAMPLE01_L002.demo.classify.log
│   └── stages/
│       ├── SAMPLE01_L001.all.done
│       └── SAMPLE01_L002.all.done
└── SAMPLE02/
    ├── classify/
    │   └── SAMPLE02_L001.demo.krakenuniq_class.tsv.gz
    ├── report/
    │   ├── SAMPLE02_L001.demo.krakenuniq_report.tsv
    │   └── SAMPLE02.demo.krakenuniq_report_aggregated.tsv
    ├── log/
    │   └── SAMPLE02_L001.demo.classify.log
    └── stages/
        └── SAMPLE02_L001.all.done
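
The naming scheme in the tree above can be written down as a small path builder, which is handy for checking that a run produced everything it should. This is a sketch inferred from the example tree alone. expected_outputs and aggregated_report are hypothetical names, and the patterns are an assumption about the rules in workflow/rules/krakenuniq.smk, not copied from them.

```python
from pathlib import PurePosixPath


def expected_outputs(out_dir, prefix, unit_id, unit_prefix):
    """Per-library output paths, following the example tree above."""
    base = PurePosixPath(out_dir) / unit_id
    stem = f"{unit_prefix}.{prefix}"
    return {
        "classify": str(base / "classify" / f"{stem}.krakenuniq_class.tsv.gz"),
        "report": str(base / "report" / f"{stem}.krakenuniq_report.tsv"),
        "log": str(base / "log" / f"{stem}.classify.log"),
        # The stamp name in the example tree does not include the prefix.
        "stage": str(base / "stages" / f"{unit_prefix}.all.done"),
    }


def aggregated_report(out_dir, prefix, unit_id):
    """Per-sample aggregated report path, following the example tree above."""
    name = f"{unit_id}.{prefix}.krakenuniq_report_aggregated.tsv"
    return str(PurePosixPath(out_dir) / unit_id / "report" / name)


# Values taken from the example config and sample sheet.
for kind, path in expected_outputs("results", "demo", "SAMPLE01", "SAMPLE01_L001").items():
    print(f"{kind:9s}{path}")
print(f"{'aggregate':9s}{aggregated_report('results', 'demo', 'SAMPLE01')}")
```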
