Commit daade52

Merge pull request #32 from Juke34/syncaline
Syncaline
2 parents 0f89fcc + efe8729 commit daade52

9 files changed

Lines changed: 608 additions & 175 deletions

CONTRIBUTING.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
# Contributing guidelines
2+
3+
We thank you in advance :thumbsup: :tada: for taking the time to contribute, whether with *code* or with *ideas*, to the project.
4+
5+
6+
## Did you find a bug?
7+
8+
* Ensure that the bug was not already reported by [searching under Issues].
9+
10+
* If you're unable to find an (open) issue addressing the problem, [open a new one]. Be sure to prefix the issue title with **[BUG]** and to include:
11+
12+
- a *clear* description,
13+
- as much relevant information as possible, and
14+
- a *code sample* or an (executable) *test case* demonstrating the expected behaviour that is not occurring.
15+
16+
## How to work on a new feature/bug
17+
18+
Create an issue on GitHub, or pick one that already exists.
19+
20+
Assign yourself to that issue.
21+
22+
Discussion of how to proceed on an issue takes place in that issue's comment section.
23+
24+
Someone may already have done part of the work, so discussing the issue first avoids duplicated effort. It also lets the team know about upcoming changes: some features affect several components, and we can plan accordingly.
25+
26+
## How we work with Git
27+
28+
All work should take place in a dedicated branch with a short descriptive name.
29+
30+
Use comments in your code, and choose variable and function names that clearly show what you intend to implement.
31+
32+
Once the feature is done you can request it to be merged back into `main` by making a Pull Request.
33+
34+
Before making the pull request, it is a good idea to rebase your branch onto `main` so that any conflicts with the `main` branch are resolved before the PR is reviewed, which gives us a clean merge.
35+
36+
37+
### General stuff about git and commit messages
38+
39+
In general it is better to commit often. Small commits are easier to roll back and also make the code easier to review.
40+
41+
Write helpful commit messages that describe the changes and, where possible, why they were necessary.
42+
43+
Each commit should contain changes that are functionally connected and/or related. If, for example, you want to write _and_ in the first line of the commit message, this is an indicator that it should have been two commits.
44+
45+
Learn how to select chunks of changed files to do multiple separate commits of unrelated things. This can be done with either `git add -p` or `git commit -p`.
46+
47+
48+
#### Helpful commit messages
49+
50+
The commit messages may be seen as meta-comments on the code that are incredibly helpful for anyone who wants to know how this piece of software is working, including colleagues (current and future) and external users.
51+
52+
Some tips about writing helpful commit messages:
53+
54+
1. Separate subject (the first line of the message) from body with a blank line.
55+
2. Limit the subject line to 50 characters.
56+
3. Capitalize the subject line.
57+
4. Do not end the subject line with a period.
58+
5. Use the imperative mood in the subject line.
59+
6. Wrap the body at 72 characters.
60+
7. Use the body to explain what and why vs. how.
61+
62+
For an in-depth explanation of the above points, please see [How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/).
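
Most of the tips above are mechanical, so they can be checked before a commit is shared. The following is a minimal sketch, assuming a hypothetical helper named `check_commit_msg` that is not part of this repository. It covers tips 1, 2, 3, 4 and 6; the imperative mood and the quality of the explanation still need a human reader.

```bash
# check_commit_msg FILE
# Hypothetical helper (not part of this repository): prints one line per
# violated rule and returns a non-zero status if any rule is violated.
check_commit_msg() {
    cm_file=$1
    cm_status=0
    cm_subject=$(sed -n '1p' "$cm_file")
    cm_second=$(sed -n '2p' "$cm_file")

    # Tip 2: limit the subject line to 50 characters
    if [ "${#cm_subject}" -gt 50 ]; then
        echo "subject is longer than 50 characters"; cm_status=1
    fi
    # Tip 3: capitalize the subject line
    case $cm_subject in
        [[:upper:]]*) ;;
        *) echo "subject is not capitalized"; cm_status=1 ;;
    esac
    # Tip 4: do not end the subject line with a period
    case $cm_subject in
        *.) echo "subject ends with a period"; cm_status=1 ;;
    esac
    # Tip 1: separate subject from body with a blank line
    if [ -n "$cm_second" ]; then
        echo "second line is not blank"; cm_status=1
    fi
    # Tip 6: wrap the body at 72 characters
    if awk 'NR > 2 && length($0) > 72 { found = 1 } END { exit !found }' "$cm_file"; then
        echo "body has lines longer than 72 characters"; cm_status=1
    fi
    return "$cm_status"
}

# Usage example with an illustrative message
msg=$(mktemp)
printf 'Rename library_type to strandedness\n\nThe samplesheet column and the parameter now share one name.\n' > "$msg"
check_commit_msg "$msg" && echo "commit message looks fine"
rm -f "$msg"
```

Git passes the path of the message file as the first argument to a `commit-msg` hook, so a function like this could be called from `.git/hooks/commit-msg` if the team wants the check to run automatically.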
63+
64+
65+
### How we do code reviews
66+
67+
A code review is initiated when someone has made a Pull Request in the appropriate repo on GitHub.
68+
69+
Work should not continue on the branch _unless_ it is a [Draft Pull Request](https://github.blog/2019-02-14-introducing-draft-pull-requests/). Once the PR is marked ready the review can start.
70+
71+
The initiator of the PR should recruit a reviewer, who is then assigned reviewer duty on the branch.
72+
73+
Other people may also look at and review the code.
74+
75+
A reviewer's job is to:
76+
77+
* Write polite and friendly comments. Remember that it can be tough to have other people criticizing your work; a little kindness goes a long way. This does not mean we should avoid commenting on things that need to be changed, of course.
78+
* Read the code and make sure it is understandable.
79+
* Make sure that commit messages and commits are structured so that it is possible to understand why certain changes were made.
80+
81+
Once the review is positive the Pull Request can be _merged_ into `main` and the feature branch deleted.
82+
83+
84+
----
85+
86+
Thanks again.
87+
88+
[searching under Issues]: https://github.com/Juke34/RAIN/issues?utf8=%E2%9C%93&q=is%3Aissue%20label%3Abug%20%5BBUG%5D%20in%3Atitle
89+
[open a new one]: https://github.com/Juke34/RAIN/issues/new?title=%5BBUG%5D

README.md

Lines changed: 205 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,205 @@
1+
![GitHub CI](https://github.com/Juke34/RAIN/actions/workflows/main.yml/badge.svg)
2+
3+
# RAIN - RNA Alterations Investigation using Nextflow
4+
5+
RAIN is a Nextflow workflow designed for epitranscriptomic analyses, enabling the detection of RNA modifications in comparison to a reference genome.
6+
Its primary goal is to distinguish true RNA editing events from genomic variants such as SNPs, with a particular emphasis on identifying A-to-I (Adenosine-to-Inosine) editing.
7+
8+
<img src="doc/img/IRD.png" width="300" height="100" /> <img src="doc/img/MIVEGEC.png" width="150" height="100" />
9+
10+
<img src="doc/img/baargin_flowchart.jpg" width="900" height="500" />
11+
12+
## Table of Contents
13+
14+
* [Foreword](#foreword)
15+
* [Flowchart](#flowchart)
16+
* [Installation](#installation)
17+
* [Nextflow](#nextflow)
18+
* [Container platform](#container-platform)
19+
* [Docker](#docker)
20+
* [Singularity](#singularity)
21+
* [Usage and test](#usage)
22+
* [Parameters](#parameters)
23+
* [Output](#output)
24+
* [Author](#author-and-contributors)
25+
* [Contributing](#contributing)
26+
27+
28+
## Foreword
29+
30+
...
31+
32+
## Flowchart
33+
34+
...
35+
36+
## Installation
37+
38+
The prerequisites to run the pipeline are:
39+
40+
* [Nextflow](https://www.nextflow.io/) >= 22.04.0
41+
* [Docker](https://www.docker.com) or [Singularity](https://sylabs.io/singularity/)
42+
43+
### Nextflow
44+
45+
* Via conda
46+
47+
<details>
48+
<summary>See here</summary>
49+
50+
```bash
51+
conda create -n nextflow
52+
conda activate nextflow
53+
conda install bioconda::nextflow
54+
```
55+
</details>
56+
57+
* Manually
58+
<details>
59+
<summary>See here</summary>
60+
Nextflow runs on most POSIX systems (Linux, macOS, etc) and can typically be installed by running these commands:
61+
62+
```bash
63+
# Make sure Java 11 or later is installed on your computer by using the command:
64+
java -version
65+
66+
# Install Nextflow by entering this command in your terminal (it creates a file named nextflow in the current directory):
67+
curl -s https://get.nextflow.io | bash
68+
69+
# Add Nextflow binary to your user's PATH:
70+
mv nextflow ~/bin/
71+
# OR system-wide installation:
72+
# sudo mv nextflow /usr/local/bin
73+
```
74+
</details>
75+
76+
### Container platform
77+
78+
To run the workflow you will need a container platform: Docker or Singularity.
79+
80+
### Docker
81+
82+
Please follow the instructions at the [Docker website](https://docs.docker.com/desktop/).
83+
84+
### Singularity
85+
86+
Please follow the instructions at the [Singularity website](https://docs.sylabs.io/guides/latest/admin-guide/installation.html).
87+
88+
## Usage
89+
90+
### Help
91+
92+
You can first check the available options and parameters by running:
93+
94+
```bash
95+
nextflow run Juke34/RAIN -r v1.5.0 --help
96+
```
97+
98+
### Profile
99+
100+
To run the workflow you must select a profile according to the container platform you want to use:
101+
- `singularity`, a profile using Singularity to run the containers
102+
- `docker`, a profile using Docker to run the containers
103+
104+
The command will look like this:
105+
106+
```bash
107+
nextflow run Juke34/RAIN -r vX.X.X -profile docker <rest of parameters>
108+
```
109+
110+
Another profile is planned (/!\\ not yet implemented):
111+
112+
- `slurm`, to add if your system uses the Slurm executor (the executor is local by default)
113+
114+
The use of the `slurm` profile will give a command like this one:
115+
116+
```bash
117+
nextflow run Juke34/RAIN -r vX.X.X -profile singularity,slurm <rest of parameters>
118+
```
119+
120+
### Test
121+
122+
With Nextflow and Docker available, you can run (where vX.X.X is the release version you wish to use):
123+
124+
```bash
125+
nextflow run -profile docker,test Juke34/RAIN -r vX.X.X
126+
```
127+
128+
Or via a clone of the repository:
129+
130+
```
131+
git clone https://github.com/Juke34/rain.git
132+
cd rain
133+
nextflow run -profile docker,test rain.nf
134+
```
135+
136+
## Parameters
137+
138+
```
139+
RAIN - RNA Alterations Investigation using Nextflow - v0.1
140+
141+
Usage example:
142+
nextflow run rain.nf -profile docker --genome /path/to/genome.fa --annotation /path/to/annotation.gff3 --reads /path/to/reads_folder --output /path/to/output --aligner hisat2
143+
144+
Parameters:
145+
--help Prints the help section
146+
147+
Input sequences:
148+
--annotation Path to the annotation file (GFF or GTF)
149+
--reads Path to the reads file, folder or csv. If a folder is provided, all the files with a proper extension in the folder will be used. You can provide remote files (comma-separated list).
150+
file extension expected : <.fastq.gz>, <.fq.gz>, <.fastq>, <.fq> or <.bam>.
151+
for paired reads, an extra <_R1_001> or <_R2_001> is expected, where <R> and <_001> are optional, e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>.
152+
csv input expects 5 columns: sample, fastq_1, fastq_2, strandedness and read_type.
153+
fastq_2 is optional and can be empty. strandedness and read_type expect the same values as the corresponding RAIN parameters; if a value is provided via a RAIN parameter, it overrides the value in the csv file.
154+
Example of csv file:
155+
sample,fastq_1,fastq_2,strandedness,read_type
156+
control1,path/to/data1.fastq.bam,,auto,short_single
157+
control2,path/to/data2_R1.fastq.gz,path/to/data2_R2.fastq.gz,auto,short_paired
158+
--genome Path to the reference genome in FASTA format.
159+
--read_type Type of reads among this list [short_paired, short_single, pacbio, ont] (no default)
160+
161+
Output:
162+
--output Path to the output directory (default: result)
163+
164+
Optional input:
165+
--aligner Aligner to use [default: hisat2]
166+
--edit_site_tool Tool used for detecting edited sites. Default: reditools3
167+
--strandedness Set the strandedness for all your input reads (default: null). In auto mode salmon will guess the library type for each fastq sample. [ 'U', 'IU', 'MU', 'OU', 'ISF', 'ISR', 'MSF', 'MSR', 'OSF', 'OSR', 'auto' ]
168+
--edit_threshold Minimal number of edited reads to count a site as edited (default: 1)
169+
--aggregation_mode Mode for aggregating edition counts mapped on genomic features. See documentation for details. Options are: "all" (default) or "cds_longest"
170+
--clipoverlap Clip overlapping sequences in read pairs to avoid double counting. (default: false)
171+
172+
Nextflow options:
173+
-profile Change the Nextflow profile (both the engine and the executor); more details in the GitHub README [debug, test, itrop, singularity, local, docker]
174+
```
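
The csv layout described in the help text can be checked before launching a run. The following is a minimal sketch, assuming a hypothetical helper named `validate_samplesheet` that is not part of RAIN. It assumes the accepted `strandedness` and `read_type` values are exactly those listed for the corresponding parameters above, and it checks only the number of columns and those values, not the header names or whether the files exist.

```bash
# validate_samplesheet FILE
# Hypothetical pre-flight check (not part of RAIN): prints one line per
# problem and returns a non-zero status if any row is rejected.
validate_samplesheet() {
    awk -F',' '
        BEGIN {
            # Accepted values, copied from the --read_type and --strandedness help text
            n = split("short_paired short_single pacbio ont", a, " ")
            for (i = 1; i <= n; i++) read_types[a[i]] = 1
            n = split("U IU MU OU ISF ISR MSF MSR OSF OSR auto", b, " ")
            for (i = 1; i <= n; i++) strandedness[b[i]] = 1
        }
        NR == 1 || /^$/ { next }   # skip the header row and blank lines
        NF != 5 {
            printf "line %d: expected 5 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        $1 == "" || $2 == "" {
            printf "line %d: sample and first reads file are required\n", NR
            bad = 1
        }
        !($4 in strandedness) {
            printf "line %d: unknown strandedness \"%s\"\n", NR, $4
            bad = 1
        }
        !($5 in read_types) {
            printf "line %d: unknown read_type \"%s\"\n", NR, $5
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Usage example with the csv shown in the help text
sheet=$(mktemp)
cat > "$sheet" <<'EOF'
sample,fastq_1,fastq_2,strandedness,read_type
control1,path/to/data1.fastq.bam,,auto,short_single
control2,path/to/data2_R1.fastq.gz,path/to/data2_R2.fastq.gz,auto,short_paired
EOF
validate_samplesheet "$sheet" && echo "samplesheet OK"
rm -f "$sheet"
```

An empty `fastq_2` field still counts as a column, so single-end rows keep the two consecutive commas.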
175+
176+
## Output
177+
178+
Here is a description of the typical output you will get from RAIN:
179+
180+
```
181+
└── rain_results # Output folder set using --output. Default: <result>
182+
183+
├── AliNe # Folder containing AliNe alignment pipeline result (see https://github.com/Juke34/AliNe)
184+
│ ├── alignment # bam alignment used by RAIN
185+
│ ├── salmon_strandedness # strandedness collected by AliNe when auto mode was used for fastq files
186+
│ └── ...
187+
188+
├── bam_indicies # bam and indices bam.bai
189+
190+
├── FastQC # FastQC quality control reports
191+
192+
├── gatk_markduplicates # metrics and bam after markduplicates
193+
194+
├── Reditools2/Reditools3/Jacusa/sapin/ # Editing output from the corresponding tool
195+
196+
└── feature_edits # Editing computed at different levels (genomic features, chromosome, global)
```
197+
198+
## Author and contributors
199+
200+
Eduardo Ascarrunz (@eascarrunz)
201+
Jacques Dainat (@Juke34)
202+
203+
## Contributing
204+
205+
Contributions from the community are welcome! See the [Contributing guidelines](https://github.com/Juke34/rain/blob/main/CONTRIBUTING.md).

build_images.sh

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -28,11 +28,9 @@ do
2828
echo ██████████████████▓▒░ Building ${imgname} ░▒▓██████████████████
2929

3030
# Reditools2 does not compile on arm64, force using amd64 compilation
31-
if [[ $dir =~ "reditools2" ]];then
32-
if [[ "$arch" == arm* || "$arch" == "aarch64" ]]; then
33-
echo "Reditools2 does not compile on arm64, force using amd64 compilation"
34-
docker_arch_option=" --platform linux/amd64"
35-
fi
31+
if [[ "$arch" == arm* || "$arch" == "aarch64" ]]; then
32+
echo "Reditools2 does not compile on arm64, force using amd64 compilation"
33+
docker_arch_option=" --platform linux/amd64"
3634
fi
3735

3836
docker build ${docker_arch_option} -t ${imgname} .
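
The hunk above removes the `reditools2` directory test, so the `linux/amd64` platform appears to be forced for every image built on an ARM host, even though the comment and the message still mention only Reditools2. The selection logic can be sketched as a small function. This is an illustration only: `platform_option` and `example-image` are hypothetical names, and it assumes `$arch` in the script comes from `uname -m`.

```bash
# platform_option ARCH
# Hypothetical helper: prints the extra `docker build` option for a given
# machine architecture, or nothing when no override is needed.
platform_option() {
    case $1 in
        arm*|aarch64) echo "--platform linux/amd64" ;;
        *) echo "" ;;
    esac
}

# Usage example: show the command that would be run on this machine
arch=$(uname -m)   # assumption: this is how the script obtains $arch
echo "docker build $(platform_option "$arch") -t example-image ."
```

Keeping the decision in one function makes it easy to restore a per-image exception later if only some images need the amd64 build.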

data/chr21/chr21_small.bam

3 MB
Binary file not shown.

data/chr21/chr21_small.csv

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
sample,input_1,input_2,strandedness,read_type
2+
test1,/Users/jacquesdainat/git/Juke34/rain/data/chr21/chr21_small_R1.fastq.gz,,auto,short_single
3+
test2,/Users/jacquesdainat/git/Juke34/rain/data/chr21/chr21_small_R2.fastq.gz,,ISR,short_single

modules/aline.nf

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ process AliNe {
3434
read_type,
3535
aligner,
3636
library_type,
37+
"--data_type rna",
3738
"--outdir $task.workDir/AliNe",
3839
].join(" ")
3940
// Copy command to shell script in work dir for reference/debugging.

modules/reditools3.nf

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,19 +14,19 @@ process reditools3 {
1414
script:
1515
// Set the strand orientation parameter from the library type parameter
1616
// Terms explained in https://salmon.readthedocs.io/en/latest/library_type.html
17-
if (meta.libtype in ["ISR", "SR"]) {
17+
if (meta.strandedness in ["ISR", "SR"]) {
1818
// First-strand oriented
1919
strand_orientation = "2"
2020
} else if (meta.libtype in ["ISF", "SF"]) {
2121
// Second-strand oriented
2222
strand_orientation = "1"
23-
} else if (meta.libtype in ["IU", "U"]) {
23+
} else if (meta.strandedness in ["IU", "U"]) {
2424
// Unstranded
2525
strand_orientation = "0"
2626
} else {
2727
// Unsupported: Pass the library type string so that it's reported in
2828
// the reditools error message
29-
strand_orientation = meta.libtype
29+
strand_orientation = "0"
3030
}
3131
base_name = bam.BaseName
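
The mapping in this hunk can be summarized outside Nextflow. The following is a minimal sketch in shell, assuming a hypothetical function named `strand_orientation`; it mirrors the values in the diff, including the new fallback to `"0"` for unsupported library types. Note that the `ISF`/`SF` branch appears in the hunk as an unchanged context line that still reads `meta.libtype`, so it may need the same rename to `meta.strandedness`, and the comment in the `else` branch still describes the old behaviour.

```bash
# strand_orientation LIBTYPE
# Hypothetical helper: prints the strand orientation value for a
# salmon-style library type string, following the values in the hunk above.
strand_orientation() {
    case $1 in
        ISR|SR) echo 2 ;;   # first-strand oriented
        ISF|SF) echo 1 ;;   # second-strand oriented
        IU|U)   echo 0 ;;   # unstranded
        *)      echo 0 ;;   # unsupported types now fall back to unstranded
    esac
}

# Usage example
for libtype in ISR ISF IU OSR; do
    echo "$libtype -> $(strand_orientation "$libtype")"
done
```

With the fallback in place, an unsupported value such as `OSR` is processed as unstranded without any error being raised, which is worth keeping in mind when interpreting results.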
3232

nextflow.config

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -55,10 +55,10 @@ profiles {
5555
test {
5656
params.aline_profiles = "${baseDir}/config/resources/base_aline.config"
5757
params.aligner = "STAR"
58-
params.reads = "${baseDir}/data/chr21/chr21_small_R1.fastq.gz "
58+
params.reads = "${baseDir}/data/chr21/chr21_small_R1.fastq.gz"
5959
params.genome = "${baseDir}/data/chr21/chr21_small.fasta.gz"
6060
params.annotation = "${baseDir}/data/chr21/chr21_small_filtered.gff3.gz"
61-
params.library_type = "ISR"
61+
params.strandedness = "ISR"
6262
params.read_type = "short_single"
6363
}
6464
test2 {
@@ -67,7 +67,7 @@ profiles {
6767
params.reads = "${baseDir}/data/chr21/"
6868
params.genome = "${baseDir}/data/chr21/chr21_small.fasta.gz"
6969
params.annotation = "${baseDir}/data/chr21/chr21_small_filtered.gff3.gz"
70-
params.library_type = "ISR"
70+
params.strandedness = "ISR"
7171
params.read_type = "short_paired"
7272
}
7373
}
