89 changes: 89 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,89 @@
# Contributing guidelines

We thank you in advance :thumbsup: :tada: for taking the time to contribute, whether with *code* or with *ideas*, to the project.


## Did you find a bug?

* Ensure that the bug was not already reported by [searching under Issues].

* If you're unable to find an (open) issue addressing the problem, [open a new one]. Be sure to prefix the issue title with **[BUG]** and to include:

- a *clear* description,
- as much relevant information as possible, and
- a *code sample* or an (executable) *test case* demonstrating the expected behaviour that is not occurring.

## How to work on a new feature/bug

Create an issue on GitHub, or pick one that already exists.

Assign yourself to that issue.

Discussions on how to proceed with an issue take place in its comment section.

Someone may already have done part of the work, so discussing the issue first avoids duplicated work and wasted time and effort. Another reason to discuss the issue beforehand is to tell the team about the planned changes: some features may affect several components, and we can plan accordingly.

## How we work with Git

All work should take place in a dedicated branch with a short descriptive name.

Use comments in your code, and choose variable and function names that clearly show what you intend to implement.

Once the feature is done you can request it to be merged back into `main` by making a Pull Request.

Before making the Pull Request it is a good idea to rebase your branch onto `main`, so that any conflicts with `main` are resolved before the PR is reviewed and the merge is clean.


### General stuff about git and commit messages

In general it is better to commit often. Small commits are easier to roll back and also make the code easier to review.

Write helpful commit messages that describe the changes and, where possible, why they were necessary.

Each commit should contain changes that are functionally connected and/or related. If, for example, you want to write _and_ in the first line of the commit message, this is an indicator that it should have been two commits.

Learn how to select chunks of changed files to do multiple separate commits of unrelated things. This can be done with either `git add -p` or `git commit -p`.


#### Helpful commit messages

The commit messages may be seen as meta-comments on the code that are incredibly helpful for anyone who wants to know how this piece of software is working, including colleagues (current and future) and external users.

Some tips about writing helpful commit messages:

1. Separate subject (the first line of the message) from body with a blank line.
2. Limit the subject line to 50 characters.
3. Capitalize the subject line.
4. Do not end the subject line with a period.
5. Use the imperative mood in the subject line.
6. Wrap the body at 72 characters.
7. Use the body to explain what and why vs. how.

For an in-depth explanation of the above points, please see [How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/).
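
Tips 2 to 4 are mechanical enough to check automatically. The snippet below is a minimal sketch, not a tool shipped with this repository: the function name `check_subject` is hypothetical, and it only looks at the subject line (length, capitalisation, trailing period).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAIN): check a commit subject line
# against tips 2-4 above. Prints one message per problem found and
# returns a non-zero status if any check fails.
check_subject() {
    local subject="$1"
    local status=0

    # Tip 2: limit the subject line to 50 characters.
    if [ "${#subject}" -gt 50 ]; then
        echo "subject is longer than 50 characters"
        status=1
    fi

    # Tip 3: capitalize the subject line.
    case "$subject" in
        [[:upper:]]*) ;;
        *) echo "subject should start with a capital letter"; status=1 ;;
    esac

    # Tip 4: do not end the subject line with a period.
    case "$subject" in
        *.) echo "subject should not end with a period"; status=1 ;;
    esac

    return "$status"
}
```

One possible use is to check the most recent commit before pushing: `check_subject "$(git log -1 --format=%s)"`.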


### How we do code reviews

A code review is initiated when someone has made a Pull Request in the appropriate repo on GitHub.

Work should not continue on the branch _unless_ it is a [Draft Pull Request](https://github.blog/2019-02-14-introducing-draft-pull-requests/). Once the PR is marked ready the review can start.

The initiator of the PR should recruit a reviewer, who is then assigned reviewer duty on the branch.

Other people may also look at and review the code.

A reviewer's job is to:

* Write polite and friendly comments. Remember that it can be tough to have other people criticizing your work; a little kindness goes a long way. Of course, this does not mean we should not comment on things that need to be changed.
* Read the code and make sure it is understandable.
* Make sure that commit messages and commits are structured so that it is possible to understand why certain changes were made.

Once the review is positive the Pull Request can be _merged_ into `main` and the feature branch deleted.


----

Thanks again.

[searching under Issues]: https://github.com/Juke34/RAIN/issues?utf8=%E2%9C%93&q=is%3Aissue%20label%3Abug%20%5BBUG%5D%20in%3Atitle
[open a new one]: https://github.com/Juke34/RAIN/issues/new?title=%5BBUG%5D
205 changes: 205 additions & 0 deletions README.md
@@ -0,0 +1,205 @@
![GitHub CI](https://github.com/Juke34/RAIN/actions/workflows/main.yml/badge.svg)

# RAIN - RNA Alterations Investigation using Nextflow

RAIN is a Nextflow workflow designed for epitranscriptomic analyses, enabling the detection of RNA modifications in comparison to a reference genome.
Its primary goal is to distinguish true RNA editing events from genomic variants such as SNPs, with a particular emphasis on identifying A-to-I (Adenosine-to-Inosine) editing.

<img src="doc/img/IRD.png" width="300" height="100" /> <img src="doc/img/MIVEGEC.png" width="150" height="100" />

<img src="doc/img/baargin_flowchart.jpg" width="900" height="500" />

## Table of Contents

* [Foreword](#foreword)
* [Flowchart](#flowchart)
* [Installation](#installation)
* [Nextflow](#nextflow)
* [Container platform](#container-platform)
* [Docker](#docker)
* [Singularity](#singularity)
* [Usage and test](#usage)
* [Parameters](#parameters)
* [Output](#output)
* [Author](#author-and-contributors)
* [Contributing](#contributing)


## Foreword

...

## Flowchart

...

## Installation

The prerequisites to run the pipeline are:

* [Nextflow](https://www.nextflow.io/) >= 22.04.0
* [Docker](https://www.docker.com) or [Singularity](https://sylabs.io/singularity/)

### Nextflow

* Via conda

<details>
<summary>See here</summary>

```bash
conda create -n nextflow
conda activate nextflow
conda install bioconda::nextflow
```
</details>

* Manually
<details>
<summary>See here</summary>
Nextflow runs on most POSIX systems (Linux, macOS, etc.) and can typically be installed by running these commands:

```bash
# Make sure Java 11 or later is installed on your computer by using the command:
java -version

# Install Nextflow by entering this command in your terminal (it creates a file named nextflow in the current directory):
curl -s https://get.nextflow.io | bash

# Add Nextflow binary to your user's PATH:
mv nextflow ~/bin/
# OR system-wide installation:
# sudo mv nextflow /usr/local/bin
```
</details>
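
The Java check in the manual installation can also be scripted. The following is a minimal sketch under stated assumptions: the function name `java_major_version` is hypothetical, and it expects the quoted version string that `java -version` reports (for example `17.0.2`, or `1.8.0_292` for the legacy numbering used up to Java 8).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAIN): print the major Java version
# for a version string such as "17.0.2", "11", "21-ea" or "1.8.0_292".
java_major_version() {
    local version="$1"
    local first="${version%%.*}"        # text before the first dot

    if [ "$first" = "1" ]; then
        # Legacy scheme "1.x.y": the major version is the second field.
        local rest="${version#*.}"
        echo "${rest%%.*}"
    else
        # Modern scheme: drop any non-numeric suffix such as "-ea".
        echo "${first%%[!0-9]*}"
    fi
}
```

The version string itself can be extracted with something like `java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'`, and the result compared against 11 with `[ "$(java_major_version "$v")" -ge 11 ]`.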

### Container platform

To run the workflow you will need a container platform: Docker or Singularity.

### Docker

Please follow the instructions at the [Docker website](https://docs.docker.com/desktop/)

### Singularity

Please follow the instructions at the [Singularity website](https://docs.sylabs.io/guides/latest/admin-guide/installation.html)

## Usage

### Help

You can first check the available options and parameters by running:

```bash
nextflow run Juke34/RAIN -r v1.5.0 --help
```

### Profile

To run the workflow you must select a profile according to the container platform you want to use:
- `singularity`, a profile using Singularity to run the containers
- `docker`, a profile using Docker to run the containers

The command will look like this:

```bash
nextflow run Juke34/RAIN -r vX.X.X -profile docker <rest of parameters>
```

Another profile is planned (/!\\ not yet implemented):

- `slurm`, to add if your system uses the Slurm executor (the executor is local by default)

The use of the `slurm` profile will give a command like this one:

```bash
nextflow run Juke34/RAIN -r vX.X.X -profile singularity,slurm <rest of parameters>
```

### Test

With Nextflow and Docker available you can run (where vX.X.X is the release version you wish to use):

```bash
nextflow run -profile docker,test Juke34/RAIN -r vX.X.X
```

Or via a clone of the repository:

```
git clone https://github.com/Juke34/rain.git
cd rain
nextflow run -profile docker,test rain.nf
```

## Parameters

```
RAIN - RNA Alterations Investigation using Nextflow - v0.1

Usage example:
nextflow run rain.nf -profile docker --genome /path/to/genome.fa --annotation /path/to/annotation.gff3 --reads /path/to/reads_folder --output /path/to/output --aligner hisat2

Parameters:
--help Prints the help section

Input sequences:
--annotation Path to the annotation file (GFF or GTF)
--reads Path to the reads file, folder or csv. If a folder is provided, all the files with a proper extension in the folder will be used. You can provide remote files (comma separated list).
file extension expected : <.fastq.gz>, <.fq.gz>, <.fastq>, <.fq> or <.bam>.
for paired reads extra <_R1_001> or <_R2_001> is expected where <R> and <_001> are optional (e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>)
csv input expects 5 columns: sample, fastq_1, fastq_2, strandedness and read_type.
fastq_2 is optional and can be empty. strandedness and read_type expect the same values as the corresponding RAIN parameters; if a value is provided via a RAIN parameter, it will override the value in the csv file.
Example of csv file:
sample,fastq_1,fastq_2,strandedness,read_type
control1,path/to/data1.fastq.bam,,auto,short_single
control2,path/to/data2_R1.fastq.gz,path/to/data2_R2.fastq.gz,auto,short_paired
--genome Path to the reference genome in FASTA format.
--read_type Type of reads among this list [short_paired, short_single, pacbio, ont] (no default)

Output:
--output Path to the output directory (default: result)

Optional input:
--aligner Aligner to use [default: hisat2]
--edit_site_tool Tool used for detecting edited sites. Default: reditools3
--strandedness Set the strandedness for all your input reads (default: null). In auto mode salmon will guess the library type for each fastq sample. [ 'U', 'IU', 'MU', 'OU', 'ISF', 'ISR', 'MSF', 'MSR', 'OSF', 'OSR', 'auto' ]
--edit_threshold Minimal number of edited reads to count a site as edited (default: 1)
--aggregation_mode Mode for aggregating edition counts mapped on genomic features. See documentation for details. Options are: "all" (default) or "cds_longest"
--clipoverlap Clip overlapping sequences in read pairs to avoid double counting. (default: false)

Nextflow options:
-profile Select the Nextflow profile (both the engine and the executor); more details in the GitHub README [debug, test, itrop, singularity, local, docker]
```
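
The csv layout described under `--reads` can be sanity-checked before launching a run. The snippet below is a minimal sketch, not part of RAIN: the function name `check_samplesheet` is hypothetical, and it only verifies that each data row has five comma-separated columns and that the last one is a `read_type` value listed under `--read_type`. It does not check that the files exist, and it does not validate `strandedness`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAIN): basic layout check for a
# samplesheet with the columns sample,fastq_1,fastq_2,strandedness,read_type.
# Prints one message per problem and returns non-zero if any row is invalid.
check_samplesheet() {
    awk -F',' '
        NR == 1 { next }                  # skip the header line
        NF == 0 { next }                  # ignore empty lines
        NF != 5 {
            printf "line %d: expected 5 columns, got %d\n", NR, NF
            bad = 1
            next
        }
        $5 !~ /^(short_paired|short_single|pacbio|ont)$/ {
            printf "line %d: unknown read_type \"%s\"\n", NR, $5
            bad = 1
        }
        END { exit (bad + 0) }
    ' "$1"
}
```

It would be called as `check_samplesheet samples.csv` (the file name is a placeholder), for example right before `nextflow run`.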

## Output

Here is a description of the typical output you will get from RAIN:

```
└── rain_results                                # Output folder set using --outdir. Default: <alignment_results>
    ├── AliNe                                   # Folder containing AliNe alignment pipeline result (see https://github.com/Juke34/AliNe)
    │   ├── alignment                           # bam alignment used by RAIN
    │   ├── salmon_strandedness                 # strandedness collected by AliNe in case auto mode was used for fastq files
    │   └── ...
    ├── bam_indicies                            # bam and indices bam.bai
    ├── FastQC                                  # FastQC quality reports
    ├── gatk_markduplicates                     # metrics and bam after markduplicates
    └── Reditools2/Reditools3/Jacusa/sapin/     # Editing output from corresponding tool
    └── feature_edits                           # Editing computed at different levels (genomic features, chromosome, global)
```

## Author and contributors

Eduardo Ascarrunz (@eascarrunz)
Jacques Dainat (@Juke34)

## Contributing

Contributions from the community are welcome! See the [Contributing guidelines](https://github.com/Juke34/rain/blob/main/CONTRIBUTING.md).
8 changes: 3 additions & 5 deletions build_images.sh
@@ -28,11 +28,9 @@ do
echo ██████████████████▓▒░ Building ${imgname} ░▒▓██████████████████

# Reditools2 does not compile on arm64, force using amd64 compilation
if [[ $dir =~ "reditools2" ]];then
if [[ "$arch" == arm* || "$arch" == "aarch64" ]]; then
echo "Reditools2 does not compile on arm64, force using amd64 compilation"
docker_arch_option=" --platform linux/amd64"
fi
if [[ "$arch" == arm* || "$arch" == "aarch64" ]]; then
echo "Reditools2 does not compile on arm64, force using amd64 compilation"
docker_arch_option=" --platform linux/amd64"
fi

docker build ${docker_arch_option} -t ${imgname} .
Binary file added data/chr21/chr21_small.bam
Binary file not shown.
3 changes: 3 additions & 0 deletions data/chr21/chr21_small.csv
@@ -0,0 +1,3 @@
sample,input_1,input_2,strandedness,read_type
test1,/Users/jacquesdainat/git/Juke34/rain/data/chr21/chr21_small_R1.fastq.gz,,auto,short_single
test2,/Users/jacquesdainat/git/Juke34/rain/data/chr21/chr21_small_R2.fastq.gz,,ISR,short_single
1 change: 1 addition & 0 deletions modules/aline.nf
@@ -34,6 +34,7 @@ process AliNe {
read_type,
aligner,
library_type,
"--data_type rna",
"--outdir $task.workDir/AliNe",
].join(" ")
// Copy command to shell script in work dir for reference/debugging.
6 changes: 3 additions & 3 deletions modules/reditools3.nf
@@ -14,19 +14,19 @@ process reditools3 {
script:
// Set the strand orientation parameter from the library type parameter
// Terms explained in https://salmon.readthedocs.io/en/latest/library_type.html
if (meta.libtype in ["ISR", "SR"]) {
if (meta.strandedness in ["ISR", "SR"]) {
// First-strand oriented
strand_orientation = "2"
} else if (meta.libtype in ["ISF", "SF"]) {
// Second-strand oriented
strand_orientation = "1"
} else if (meta.libtype in ["IU", "U"]) {
} else if (meta.strandedness in ["IU", "U"]) {
// Unstranded
strand_orientation = "0"
} else {
// Unsupported: Pass the library type string so that it's reported in
// the reditools error message
strand_orientation = meta.libtype
strand_orientation = "0"
}
base_name = bam.BaseName

6 changes: 3 additions & 3 deletions nextflow.config
@@ -55,10 +55,10 @@ profiles {
test {
params.aline_profiles = "${baseDir}/config/resources/base_aline.config"
params.aligner = "STAR"
params.reads = "${baseDir}/data/chr21/chr21_small_R1.fastq.gz "
params.reads = "${baseDir}/data/chr21/chr21_small_R1.fastq.gz"
params.genome = "${baseDir}/data/chr21/chr21_small.fasta.gz"
params.annotation = "${baseDir}/data/chr21/chr21_small_filtered.gff3.gz"
params.library_type = "ISR"
params.strandedness = "ISR"
params.read_type = "short_single"
}
test2 {
@@ -67,7 +67,7 @@ profiles {
params.reads = "${baseDir}/data/chr21/"
params.genome = "${baseDir}/data/chr21/chr21_small.fasta.gz"
params.annotation = "${baseDir}/data/chr21/chr21_small_filtered.gff3.gz"
params.library_type = "ISR"
params.strandedness = "ISR"
params.read_type = "short_paired"
}
}