GitHub - BiCroLab/nextflow-gpseq: Nextflow pipeline for processing of GPSeq data

Nextflow pipeline for processing of GPSeq data

Genomic loci positioning by sequencing (GPSeq) is a genome-wide method for inferring distances to the nuclear lamina all along the nuclear radius.

Getting Started

The whole pipeline can be cloned in your current working directory with:

git clone https://github.com/BiCroLab/nextflow-gpseq
cd nextflow-gpseq

Requirements

conda create -n nextflow -c conda-forge -c bioconda nextflow=23.10.0 singularity=3.8*

nextflow (tested on version 23.10 or higher)
singularity (tested on version 3.8.6)

To test the pipeline with default settings and a test dataset: `nextflow run main.nf -profile test`

Running the pipeline with your own dataset

Create a `samplesheet.csv` containing the following columns: `sample,fastq,barcode,condition`.
The file must be comma-separated, with rows sorted by digestion time point (in seconds).
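As a rough illustration, the sketch below writes a small samplesheet and sorts its rows while keeping the header on top. The sample names, FASTQ paths and barcodes are placeholders, and it assumes that the `condition` column holds the digestion time in seconds, which is not stated explicitly above; adjust the sort column if your sheet stores the time point elsewhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical rows: sample names, FASTQ paths and barcodes are placeholders.
# Assumption: the "condition" column holds the digestion time in seconds.
cat > samplesheet.unsorted.csv <<'EOF'
sample,fastq,barcode,condition
sample_30min,/path/to/sample_30min.fastq.gz,ACGTACGT,1800
sample_10s,/path/to/sample_10s.fastq.gz,TTGCAAGC,10
sample_5min,/path/to/sample_5min.fastq.gz,GGATCCTA,300
EOF

# Keep the header first, then sort the data rows numerically on column 4.
{
  head -n 1 samplesheet.unsorted.csv
  tail -n +2 samplesheet.unsorted.csv | sort -t, -k4,4n
} > samplesheet.csv

cat samplesheet.csv
```

The resulting `samplesheet.csv` can then be passed to the pipeline with `--samplesheet`.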

Check and adjust required settings in nextflow.config and igenomes.config.

Run the pipeline with the following command:

nextflow run main.nf --samplesheet /path/to/samplesheet.csv --outdir /path/to/results --fasta /path/to/reference.fa --bwt2index /path/to/bowtie2/index/folder -resume

General Settings:

| Setting | Description |
|---|---|
| `enzyme` / `cutsite` | enzyme name (e.g. `DpnII`) and its recognition motif (e.g. `GATC`) |
| `binsizes` | comma-separated list of bin sizes for calculating GPSeq scores, each written as `window:smoothing`, e.g. `"1e+05:1e+05,1e+05:1e+04,5e+04:5e+04"` |
| `normalization` | data normalization by library or by chromosome (`"lib"` / `"chrom"`). `"lib"`: each experiment is normalized based on its total read count. `"chrom"`: normalization is applied independently for each chromosome. |
| `site_domain` | determines which restriction sites are considered when computing the score. Default: `"universe"`; see also `"separate"` / `"union"`. For a detailed explanation of all `site_domain` settings, please refer to the supplementary information of our GPSeq paper. |
| `score_outlier_tag` | setting to filter out outliers (default: `"iqr:1.5"`) |
| `bed_outlier_tag` | setting to filter out outliers (default: `"chisq:0.01"`) |
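One way to keep these settings together without editing `nextflow.config` is a parameters file, which Nextflow reads through its `-params-file` option (YAML or JSON). The sketch below assumes the pipeline exposes the settings in the table as ordinary `params` under exactly these names; the file name `params.yaml` and the choice of `"lib"` for `normalization` are illustrative only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write the general settings to a YAML parameters file.
# Values are the examples and defaults quoted in the table above;
# "lib" for normalization is an arbitrary pick for illustration.
cat > params.yaml <<'EOF'
enzyme: "DpnII"
cutsite: "GATC"
binsizes: "1e+05:1e+05,1e+05:1e+04,5e+04:5e+04"
normalization: "lib"
site_domain: "universe"
score_outlier_tag: "iqr:1.5"
bed_outlier_tag: "chisq:0.01"
EOF

cat params.yaml
```

The file would then be supplied alongside the usual arguments, for example `nextflow run main.nf -params-file params.yaml --samplesheet /path/to/samplesheet.csv --outdir /path/to/results`.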

Genome Settings:

Genome-related settings can be modified from igenomes.config:

| Setting | Description |
|---|---|
| `fasta` | path to reference `genome.fa` genome sequence |
| `fasta_index` | path to corresponding `genome.fa.fai` genome index |
| `bowtie2` | path to directory containing bowtie2 index |
| `mask_bed` | path to bed file containing regions to be masked |
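Before launching a long run it can help to confirm that these four paths actually point at usable files. The function below is a hypothetical helper, not part of the pipeline: it assumes all four settings are required, and it relies on the fact that bowtie2 index files end in `.bt2` (or `.bt2l` for large genomes).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the genome settings (not part of the pipeline).
# Usage: check_genome_inputs <fasta> <fasta_index> <bowtie2_dir> <mask_bed>
check_genome_inputs() {
  local fasta=$1 fasta_index=$2 bowtie2_dir=$3 mask_bed=$4
  local status=0 f

  # Each of these must exist and be non-empty.
  for f in "$fasta" "$fasta_index" "$mask_bed"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty file: $f" >&2
      status=1
    fi
  done

  # The index directory must contain at least one *.bt2 / *.bt2l file.
  if ! ls "$bowtie2_dir"/*.bt2* >/dev/null 2>&1; then
    echo "no bowtie2 index files (*.bt2 / *.bt2l) in: $bowtie2_dir" >&2
    status=1
  fi

  return "$status"
}
```

A call such as `check_genome_inputs genome.fa genome.fa.fai /path/to/bowtie2/index/folder mask.bed` returns a non-zero status and prints the offending path if anything is missing.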
