A workflow made with Snakemake to analyse the expression of specific genes.
RNA-Seq-analysis is an open software workflow that uses Snakemake to analyze clinical data, following the procedure described in the STAR Protocols paper "Analysis workflow of publicly available RNA-sequencing datasets". The two graphics below show the workflow rules as a graph and the files each rule takes as input and produces as output.
Ensure you have the required dependencies:
- mamba >= 1.5 (alternatively conda >= 23.11)
- an environment that has snakemake >= 8.5 installed as specified here
- a stable internet connection
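The version requirements above can be checked before starting. The following is a minimal sketch, not part of the workflow itself: the helper names `version_ge`, `first_version` and `check_tool` are hypothetical, and it assumes a shell with GNU `sort -V` and `grep -oE` available.

```shell
# Succeeds if version $1 >= version $2 (relies on version sort, `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the first dotted version number from `--version` output on stdin.
first_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# check_tool NAME MINIMUM: report whether NAME is installed and recent enough.
check_tool() {
    local name="$1" minimum="$2" found
    if ! command -v "$name" >/dev/null 2>&1; then
        echo "$name: not found"
        return 1
    fi
    found="$("$name" --version 2>&1 | first_version)"
    if version_ge "$found" "$minimum"; then
        echo "$name: $found (ok, >= $minimum)"
    else
        echo "$name: $found (too old, need >= $minimum)"
        return 1
    fi
}

# Example calls; each prints a status line and never aborts the shell.
check_tool mamba 1.5 || check_tool conda 23.11 || true
check_tool snakemake 8.5 || true
```

Run the snakemake check inside your snakemake environment, since the tool is usually not visible from the base environment.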
- Clone this repository via:
  git clone https://github.com/ToLeWeiss/RNA-Seq-analysis.git --depth 1
- Change the config if necessary (e.g. if you want to use a different GEO dataset or analyze a different set of genes)
- Activate the base environment and your snakemake environment
  If you use mamba:
  mamba activate base && mamba activate <your snakemake environment>
  If you use conda:
  conda activate base && conda activate <your snakemake environment>
- Run the workflow. You can choose the number of cores yourself; the command below sets it to 1, which should work on all systems.
  snakemake --use-conda --cores 1
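If you would rather not hard-code the core count, the run step can be wrapped as below. This is a sketch under assumptions: `pick_cores` and `run_workflow` are hypothetical helper names, and the dry run (`--dry-run`) is an extra precaution added here, not a step the workflow requires.

```shell
# Choose a core count: use the first argument if given, otherwise ask the
# system (`nproc` on Linux, `sysctl` on macOS), and fall back to 1.
pick_cores() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif command -v nproc >/dev/null 2>&1; then
        nproc
    elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
        sysctl -n hw.ncpu
    else
        echo 1
    fi
}

# run_workflow [CORES]: preview the jobs first, then execute the workflow.
# Call it from the repository root with the snakemake environment active.
run_workflow() {
    local cores
    cores="$(pick_cores "${1:-}")"
    # Dry run: list the jobs Snakemake would execute without running them.
    snakemake --use-conda --cores "$cores" --dry-run || return 1
    # Real run with the same settings.
    snakemake --use-conda --cores "$cores"
}
```

For example, `run_workflow 4` uses four cores, while `run_workflow` without arguments uses whatever `pick_cores` detects.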
OPTIONAL
- Generate a report from the executed workflow
  After the workflow has run, you can generate a web report as follows:
  snakemake --report report.html
  To change the text of the report, edit the respective *.rst file in the report directory.
STAR Protocols
Original Paper
Snakemake
R Packages
- DESeq2
- ggplot2
  - H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
- GGally
  - Schloerke B, Cook D, Larmarange J, Briatte F, Marbach M, Thoen E, Elberg A, Crowley J (2024). GGally: Extension to 'ggplot2'.
- canvasXpress
- factoextra
- clinfun
  - Seshan V, Whiting K (2023). clinfun: Clinical Trial Design and Data Analysis Functions. R package version 1.1.1.
- GEOquery
Pandoc