dhslab/chromhmm is a bioinformatics pipeline that ...
It runs ChromHMM's Learn Model, Make Segmentation, and/or Overlap Enrichment steps on the provided data and models.
First, prepare a samplesheet with your input data that looks as follows:
samplesheet.csv:
id,sample,mark,file
sample1_H3K27ac_1,sample1,H3K27ac,sample1_H3K27ac_1.narrowPeak
sample1_H3K27ac_2,sample1,H3K27ac,sample1_H3K27ac_2.narrowPeak
sample1_H3K27me3_1,sample1,H3K27me3,sample1_H3K27me3_1.sorted.bam
sample1_H3K27me3_2,sample1,H3K27me3,sample1_H3K27me3_2.sorted.bam
sample1_H3K36me3_1,sample1,H3K36me3,sample1_H3K36me3_1.sorted.bam
sample1_H3K36me3_2,sample1,H3K36me3,sample1_H3K36me3_2.sorted.bam
sample1_H3K4me_1,sample1,H3K4me,sample1_H3K4me_1.narrowPeak
sample1_H3K4me_2,sample1,H3K4me,sample1_H3K4me_2.narrowPeak
sample1_H3K4me3_1,sample1,H3K4me3,sample1_H3K4me3_1.narrowPeak
sample1_H3K4me3_2,sample1,H3K4me3,sample1_H3K4me3_2.narrowPeak
sample1_H3K9me3_1,sample1,H3K9me3,sample1_H3K9me3_1.narrowPeak
sample1_H3K9me3_2,sample1,H3K9me3,sample1_H3K9me3_2.narrowPeak
sample1_wgbs,sample1,methylation,sample1-wgbs.meth.bed.gz
The accepted file formats for the marks are `.narrowPeak`, `.bam`, and `.bed.gz`.
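
A samplesheet can be checked against the header and accepted formats described above before launching a run. The following is a minimal sketch, not part of the pipeline: `check_samplesheet` is a hypothetical helper, and it assumes the header must be exactly `id,sample,mark,file` and that formats are recognized by file suffix alone.

```python
import csv
import io

# Suffixes taken from the accepted file formats listed above.
ACCEPTED_SUFFIXES = (".narrowPeak", ".bam", ".bed.gz")
# Assumption: the header must match the example samplesheet exactly.
EXPECTED_HEADER = ["id", "sample", "mark", "file"]


def check_samplesheet(text):
    """Return a list of human-readable problems found in samplesheet text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        path = (row["file"] or "").strip()
        if not path.endswith(ACCEPTED_SUFFIXES):
            problems.append(f"line {line_no}: unsupported file type: {path}")
    return problems


# The last row uses a made-up .bigWig file to show a rejected entry.
example = """id,sample,mark,file
sample1_H3K27ac_1,sample1,H3K27ac,sample1_H3K27ac_1.narrowPeak
sample1_H3K27me3_1,sample1,H3K27me3,sample1_H3K27me3_1.sorted.bam
sample1_wgbs,sample1,methylation,sample1-wgbs.meth.bed.gz
sample1_bad,sample1,H3K4me3,sample1_H3K4me3_1.bigWig
"""

for problem in check_samplesheet(example):
    print(problem)
```

Only the `.bigWig` row is reported. The pipeline may apply its own, stricter validation.
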
Now, you can run the pipeline using:
nextflow run dhslab/nf-chromhmm \
-profile ris \
--samplesheet samplesheet.csv \
--outdir <OUTDIR> \
--make_segmentation/--learn_model

Additional inputs:
- `--regions`: /path/to/regions
- `--models`: /path/to/models
- `--states`: list of numbers of states
- `--beds`: /path/to/bed.csv

If multiple states or regions are provided, enter them as a comma-separated list. Examples are in the `conf` folder.
To run Overlap Enrichment, provide a CSV file that lists, for each sample, a BED file with the regions of interest. Each `sample` value must match a `sample` value in the samplesheet.
sample,bed
sample_1,/path/to/regions/sample_1.bed
sample_2,/path/to/regions/sample_2.bed
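
The matching requirement can be verified before a run. The sketch below is not part of the pipeline: `unmatched_bed_samples` is a hypothetical helper, the file contents are illustrative, and it assumes sample names are compared as exact, case-sensitive strings (so `sample1` and `sample_1` would count as different samples).

```python
import csv
import io


def unmatched_bed_samples(samplesheet_text, beds_text):
    """Return sample names from the bed CSV that are absent from the samplesheet."""
    sheet_rows = csv.DictReader(io.StringIO(samplesheet_text))
    known = {row["sample"] for row in sheet_rows}
    bed_rows = csv.DictReader(io.StringIO(beds_text))
    return sorted({row["sample"] for row in bed_rows} - known)


# Illustrative inputs: sample_2 has a BED file but no samplesheet entry.
samplesheet_text = """id,sample,mark,file
sample1_H3K27ac_1,sample1,H3K27ac,sample1_H3K27ac_1.narrowPeak
sample1_wgbs,sample1,methylation,sample1-wgbs.meth.bed.gz
"""

beds_text = """sample,bed
sample1,/path/to/regions/sample1.bed
sample_2,/path/to/regions/sample_2.bed
"""

print(unmatched_bed_samples(samplesheet_text, beds_text))  # prints ['sample_2']
```

An empty list means every BED entry has a corresponding sample in the samplesheet.
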
> [!WARNING]
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).
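
As an illustration of the `-params-file` route, the sketch below builds a JSON parameters file (Nextflow accepts JSON or YAML for this option). The parameter names mirror the CLI flags shown earlier. The values are placeholders, and passing `states` as a comma-separated string in a params file is an assumption, not confirmed pipeline behavior.

```python
import json

# Keys mirror the CLI flags shown earlier; values are placeholders.
params = {
    "samplesheet": "samplesheet.csv",
    "outdir": "<OUTDIR>",
    "learn_model": True,
    # Hypothetical state counts, comma-separated as described for the CLI.
    "states": "5,10",
}

params_json = json.dumps(params, indent=2)
print(params_json)
```

Saving this output as, for example, `params.json` would allow the run command to use `-params-file params.json` in place of the individual `--` options.
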
## Credits
dhslab/chromhmm was originally written by Nidhi.
## Contributions and Support
If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
## Citations
<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
<!-- If you use dhslab/chromhmm for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->
<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->
An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.
This pipeline uses code and infrastructure developed and maintained by the [nf-core](https://nf-co.re) community, reused here under the [MIT license](https://github.com/nf-core/tools/blob/main/LICENSE).
> **The nf-core framework for community-curated bioinformatics pipelines.**
>
> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
>
> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).