|
| 1 | + |
| 2 | + |
| 3 | +# RAIN - RNA Alterations Investigation using Nextflow |
| 4 | + |
| 5 | +RAIN is a Nextflow workflow designed for epitranscriptomic analyses, enabling the detection of RNA modifications in comparison to a reference genome. |
| 6 | +Its primary goal is to distinguish true RNA editing events from genomic variants such as SNPs, with a particular emphasis on identifying A-to-I (Adenosine-to-Inosine) editing. |
| 7 | + |
| 8 | +<img src="doc/img/IRD.png" width="300" height="100" /> <img src="doc/img/MIVEGEC.png" width="150" height="100" /> |
| 9 | + |
| 10 | +<img src="doc/img/baargin_flowchart.jpg" width="900" height="500" /> |
| 11 | + |
| 12 | +## Table of Contents |
| 13 | + |
| 14 | + * [Foreword](#foreword) |
| 15 | + * [Flowchart](#flowchart) |
| 16 | + * [Installation](#installation) |
| 17 | + * [Nextflow](#nextflow) |
| 18 | + * [Container platform](#container-platform) |
| 19 | + * [Docker](#docker) |
| 20 | + * [Singularity](#singularity) |
| 21 | + * [Usage and test](#usage) |
| 22 | + * [Parameters](#parameters) |
| 23 | + * [Output](#output) |
| 24 | + * [Author](#author-and-contributors) |
| 25 | + * [Contributing](#contributing) |
| 26 | + |
| 27 | + |
| 28 | +## Foreword |
| 29 | + |
| 30 | +... |
| 31 | + |
| 32 | +## Flowchart |
| 33 | + |
| 34 | +... |
| 35 | + |
| 36 | +## Installation |
| 37 | + |
| 38 | +The prerequisites to run the pipeline are: |
| 39 | + |
| 40 | + * [Nextflow](https://www.nextflow.io/) >= 22.04.0 |
| 41 | + * [Docker](https://www.docker.com) or [Singularity](https://sylabs.io/singularity/) |
| 42 | + |
| 43 | +### Nextflow |
| 44 | + |
| 45 | + * Via conda |
| 46 | + |
| 47 | + <details> |
| 48 | + <summary>See here</summary> |
| 49 | + |
| 50 | + ```bash |
| 51 | + conda create -n nextflow |
| 52 | + conda activate nextflow |
| 53 | + conda install bioconda::nextflow |
| 54 | + ``` |
| 55 | + </details> |
| 56 | + |
| 57 | + * Manually |
| 58 | + <details> |
| 59 | + <summary>See here</summary> |
| 60 | + Nextflow runs on most POSIX systems (Linux, macOS, etc.) and can typically be installed by running these commands: |
| 61 | + |
| 62 | + ```bash |
| 63 | + # Make sure Java 11 or later is installed on your computer by using the command: |
| 64 | + java -version |
| 65 | + |
| 66 | + # Install Nextflow by entering this command in your terminal (it creates a file named nextflow in the current directory): |
| 67 | + curl -s https://get.nextflow.io | bash |
| 68 | + |
| 69 | + # Add Nextflow binary to your user's PATH: |
| 70 | + mv nextflow ~/bin/ |
| 71 | + # OR system-wide installation: |
| 72 | + # sudo mv nextflow /usr/local/bin |
| 73 | + ``` |
| 74 | + </details> |
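
The Java requirement mentioned above can also be checked from a script before installing Nextflow. The following is a minimal sketch, not part of RAIN: `java_major` is a hypothetical helper name, and the parsing assumes that `java -version` prints the version between double quotes on its first line, which is the usual behaviour but may differ for some vendors.

```bash
# Hypothetical helper (not part of RAIN): print the major Java version for a
# version string such as "17.0.2", "11.0.21" or the legacy "1.8.0_292".
java_major() {
    local version first rest
    version="$1"
    first="${version%%.*}"          # text before the first dot
    first="${first%%[!0-9]*}"       # drop suffixes such as "-ea"
    if [ "$first" = "1" ]; then
        # Legacy scheme (Java 8 and older): "1.8.0_292" means major version 8.
        rest="${version#1.}"
        echo "${rest%%.*}"
    else
        echo "$first"
    fi
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; the version is the quoted field.
    found="$(java -version 2>&1 | head -n 1 | cut -d '"' -f 2)"
    major="$(java_major "$found")"
    if [ -n "$major" ] && [ "$major" -ge 11 ]; then
        echo "Java $found is recent enough for Nextflow."
    else
        echo "Java '$found' is too old or was not recognised; version 11 or later is needed."
    fi
else
    echo "Java was not found in PATH."
fi
```

The check only reports what it finds; it does not install or change anything.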
| 75 | + |
| 76 | +### Container platform |
| 77 | + |
| 78 | +To run the workflow you will need a container platform: Docker or Singularity. |
| 79 | + |
| 80 | +### Docker |
| 81 | + |
| 82 | +Please follow the instructions at the [Docker website](https://docs.docker.com/desktop/) |
| 83 | + |
| 84 | +### Singularity |
| 85 | + |
| 86 | +Please follow the instructions at the [Singularity website](https://docs.sylabs.io/guides/latest/admin-guide/installation.html) |
| 87 | + |
| 88 | +## Usage |
| 89 | + |
| 90 | +### Help |
| 91 | + |
| 92 | +You can first check the available options and parameters by running: |
| 93 | + |
| 94 | +```bash |
| 95 | +nextflow run Juke34/RAIN -r v1.5.0 --help |
| 96 | +``` |
| 97 | + |
| 98 | +### Profile |
| 99 | + |
| 100 | +To run the workflow you must select a profile according to the container platform you want to use: |
| 101 | +- `singularity`, a profile using Singularity to run the containers |
| 102 | +- `docker`, a profile using Docker to run the containers |
| 103 | + |
| 104 | +The command will look like this: |
| 105 | + |
| 106 | +```bash |
| 107 | +nextflow run Juke34/RAIN -r vX.X.X -profile docker <rest of parameters> |
| 108 | +``` |
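
When the same launch script is shared between machines, the container profile can be picked from whichever runtime is installed. This is a small sketch under assumptions: `detect_profile` is a hypothetical helper, not something RAIN provides, and finding `docker` in `PATH` does not guarantee that the Docker daemon is running.

```bash
# Hypothetical helper (not part of RAIN): print the profile name matching the
# first container runtime found in PATH, or return 1 if none is installed.
detect_profile() {
    if command -v docker >/dev/null 2>&1; then
        echo "docker"
    elif command -v singularity >/dev/null 2>&1; then
        echo "singularity"
    else
        return 1
    fi
}

if profile="$(detect_profile)"; then
    # Print the command instead of running it; vX.X.X stands for a release tag.
    echo "nextflow run Juke34/RAIN -r vX.X.X -profile $profile <rest of parameters>"
else
    echo "Neither docker nor singularity was found in PATH." >&2
fi
```

Docker is tried first here only as an arbitrary choice; swap the two branches if Singularity should take precedence.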
| 109 | + |
| 110 | +Another profile is available (/!\\ not yet implemented): |
| 111 | + |
| 112 | +- `slurm`, to add if your system uses a Slurm executor (the executor is local by default) |
| 113 | + |
| 114 | +The use of the `slurm` profile will give a command like this one: |
| 115 | + |
| 116 | +```bash |
| 117 | +nextflow run Juke34/RAIN -r vX.X.X -profile singularity,slurm <rest of parameters> |
| 118 | +``` |
| 119 | + |
| 120 | +### Test |
| 121 | + |
| 122 | +With Nextflow and Docker available, you can run (where vX.X.X is the release version you wish to use): |
| 123 | + |
| 124 | +```bash |
| 125 | +nextflow run -profile docker,test Juke34/RAIN -r vX.X.X |
| 126 | +``` |
| 127 | + |
| 128 | +Or via a clone of the repository: |
| 129 | + |
| 130 | +```bash |
| 131 | +git clone https://github.com/Juke34/rain.git |
| 132 | +cd rain |
| 133 | +nextflow run -profile docker,test rain.nf |
| 134 | +``` |
| 135 | + |
| 136 | +## Parameters |
| 137 | + |
| 138 | +``` |
| 139 | +RAIN - RNA Alterations Investigation using Nextflow - v0.1 |
| 140 | + |
| 141 | + Usage example: |
| 142 | + nextflow run rain.nf -profile docker --genome /path/to/genome.fa --annotation /path/to/annotation.gff3 --reads /path/to/reads_folder --output /path/to/output --aligner hisat2 |
| 143 | + |
| 144 | + Parameters: |
| 145 | + --help Prints the help section |
| 146 | + |
| 147 | + Input sequences: |
| 148 | + --annotation Path to the annotation file (GFF or GTF) |
| 149 | + --reads Path to the reads file, folder or csv. If a folder is provided, all the files with a proper extension in the folder will be used. You can provide remote files (comma-separated list). |
| 150 | + File extension expected: <.fastq.gz>, <.fq.gz>, <.fastq>, <.fq> or <.bam>. |
| 151 | + For paired reads, an extra <_R1_001> or <_R2_001> suffix is expected, where <R> and <_001> are optional (e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>). |
| 152 | + csv input expects 5 columns: sample, fastq_1, fastq_2, strandedness and read_type. |
| 153 | + fastq_2 is optional and can be empty. strandedness and read_type expect the same values as the corresponding RAIN parameters; if a value is provided via a RAIN parameter, it overrides the value in the csv file. |
| 154 | + Example of csv file: |
| 155 | + sample,fastq_1,fastq_2,strandedness,read_type |
| 156 | + control1,path/to/data1.fastq.bam,,auto,short_single |
| 157 | + control2,path/to/data2_R1.fastq.gz,path/to/data2_R2.fastq.gz,auto,short_paired |
| 158 | + --genome Path to the reference genome in FASTA format. |
| 159 | + --read_type Type of reads among this list [short_paired, short_single, pacbio, ont] (no default) |
| 160 | + |
| 161 | + Output: |
| 162 | + --output Path to the output directory (default: result) |
| 163 | + |
| 164 | + Optional input: |
| 165 | + --aligner Aligner to use [default: hisat2] |
| 166 | + --edit_site_tool Tool used for detecting edited sites. Default: reditools3 |
| 167 | + --strandedness Set the strandedness for all your input reads (default: null). In auto mode salmon will guess the library type for each fastq sample. [ 'U', 'IU', 'MU', 'OU', 'ISF', 'ISR', 'MSF', 'MSR', 'OSF', 'OSR', 'auto' ] |
| 168 | + --edit_threshold Minimal number of edited reads to count a site as edited (default: 1) |
| 169 | + --aggregation_mode Mode for aggregating editing counts mapped on genomic features. See documentation for details. Options are: "all" (default) or "cds_longest" |
| 170 | + --clipoverlap Clip overlapping sequences in read pairs to avoid double counting. (default: false) |
| 171 | + |
| 172 | + Nextflow options: |
| 173 | + -profile Select the Nextflow profile(s) for both the engine and the executor; more details in the GitHub README [debug, test, itrop, singularity, local, docker] |
| 174 | +``` |
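
As an illustration of the csv input described for `--reads`, the sketch below writes a small sample sheet using the header row from the example above and checks that every row has the same number of fields as that header. The file name `samplesheet.csv`, the sample names and the paths are placeholders, and the `awk` check is a convenience added here, not a RAIN feature.

```bash
# Write a two-sample sheet; sample names and paths are placeholders.
cat > samplesheet.csv <<'EOF'
sample,fastq_1,fastq_2,strandedness,read_type
control1,path/to/data1.fastq.gz,,auto,short_single
control2,path/to/data2_R1.fastq.gz,path/to/data2_R2.fastq.gz,auto,short_paired
EOF

# Sanity check: list the rows that do not have exactly 5 comma-separated
# fields (an empty fastq_2 still counts as a field).
bad_rows="$(awk -F',' 'NF != 5 { print NR }' samplesheet.csv)"
if [ -n "$bad_rows" ]; then
    echo "Malformed row(s) in samplesheet.csv: $bad_rows" >&2
else
    echo "samplesheet.csv looks well formed."
fi

# The sheet would then be passed through --reads, for example:
# nextflow run rain.nf -profile docker --genome /path/to/genome.fa \
#     --annotation /path/to/annotation.gff3 --reads samplesheet.csv --output /path/to/output
```

Single-end samples leave `fastq_2` empty, which is why the first data row contains two consecutive commas.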
| 175 | + |
| 176 | +## Output |
| 177 | + |
| 178 | +Here is a description of the typical output you will get from RAIN: |
| 179 | + |
| 180 | +``` |
| 181 | +└── rain_results # Output folder set using --output (default: result) |
| 182 | + │ |
| 183 | + ├── AliNe # Folder containing AliNe alignment pipeline result (see https://github.com/Juke34/AliNe) |
| 184 | + │ ├── alignment # bam alignment used by RAIN |
| 185 | + │ ├── salmon_strandedness # strandedness collected by AliNe when auto mode was used for fastq files |
| 186 | + │ └── ... |
| 187 | + │ |
| 188 | + ├── bam_indicies # bam files and their indices (bam.bai) |
| 189 | + │ |
| 190 | + ├── FastQC # FastQC quality-control reports |
| 191 | + │ |
| 192 | + ├── gatk_markduplicates # metrics and bam after markduplicates |
| 193 | + │ |
| 194 | + └── Reditools2/Reditools3/Jacusa/sapin/ # Editing output from corresponding tool |
| 195 | + │ |
| 196 | + └── feature_edits # Editing computed at different levels (genomic features, chromosome, global) |
| 197 | +``` |
| 198 | +## Author and contributors |
| 199 | + |
| 200 | + * Eduardo Ascarrunz (@eascarrunz) |
| 201 | + * Jacques Dainat (@Juke34) |
| 202 | + |
| 203 | +## Contributing |
| 204 | + |
| 205 | +Contributions from the community are welcome! See the [Contributing guidelines](https://github.com/Juke34/rain/blob/main/CONTRIBUTING.md). |