Metadoon automates the workflow from FASTQ preprocessing to statistical analysis and visualization in R, using tools such as VSEARCH and phyloseq. It features a streamlined 5-step interface and runs either via Docker or natively via Conda.
The environment includes:
| Component | Purpose |
|---|---|
| Python 3.10 | GUI interface (Tkinter) and pipeline logic |
| R (Latest) | Statistical analysis and plotting |
| VSEARCH | FASTQ processing (merge, filter, cluster) |
| Libraries | phyloseq, DESeq2, ggplot2, vegan, etc. |
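
For a native (Conda) install, a quick check that the components in the table are reachable can save a failed run. The following is a minimal sketch, not part of Metadoon itself; the executable names `python`, `Rscript`, and `vsearch` are assumptions about how the tools appear on your PATH.

```python
import shutil

# Executable names are assumptions for illustration; adjust them to your installation.
REQUIRED_TOOLS = ("python", "Rscript", "vsearch")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Run it inside the activated environment; an empty result suggests the three executables are visible, although it does not verify their versions or the installed R libraries.
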
Easy start scripts for all platforms.
For macOS (.command), Linux (.sh), and WSL users only:
Before running the scripts for the first time, you must grant execution permissions via terminal.
- Open a terminal inside the Metadoon folder.
- Run the command:
chmod +x *
Note: Windows users (.bat) DO NOT need this step. You can run the file directly.
- Windows & Linux: Docker installed (Enable WSL 2 for Windows).
- macOS: Conda installed.
  - The macOS .command launcher runs the native Conda version, not Docker.
Just double-click the launcher for your OS:
- 🪟 Windows: Double-click Windows_Run.bat (runs Docker).
- 🍎 macOS: Double-click MacOS_Run.command (runs Conda/native).
- 🐧 Linux: Run ./Linux_Run.sh (runs Docker).
Recommended for Linux/WSL users or advanced users who prefer manual control.
Follow these steps to run Metadoon directly on your system without the one-click scripts.
- Conda (Anaconda or Miniconda) must be installed.
Open your terminal and run the following commands in order:
Step 1: Clone the repository
git clone https://github.com/rdo-adan/Metadoon.git
Step 2: Enter the directory
cd Metadoon/
Step 3: Grant execution permissions
Essential to ensure all scripts can run.
chmod +x *
Step 4: Install dependencies
This script creates the metadoon environment and installs R, Python, and VSEARCH.
bash setup.sh
Step 5: Activate environment & Run
conda activate metadoon
python metadoon.py
The new interface guides you through 5 simple steps:
- Load FASTQ Files: Select your raw data (file names must contain _R1_ and _R2_).
- Configure Parameters: Adjust threads, max errors, and databases (optional).
- RUN PIPELINE: Starts the analysis (Merge -> Filter -> Cluster -> Taxonomy -> Stats).
- Generate Report: Creates the final HTML summary after the run finishes.
- Save Results: Exports all tables, plots, and reports to a clean folder.
If using Docker (Windows/Linux script), Metadoon maps your local folders:
- /workspace ⮕ Metadoon folder (results are saved here).
- /app/YOUR_DATA ⮕ User Profile (Documents, Downloads).
- /app/C_Drive ⮕ C: Drive (Windows only).
💡 Native/macOS Users: You have direct access to your entire file system.
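
To find a file inside the container, it can help to translate its host path by hand. The sketch below illustrates the mapping above for a Linux host only; the function name `to_container_path` and the example host paths are hypothetical, "User Profile" is assumed to mean the home directory, and the Windows-only /app/C_Drive mount is left out.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, metadoon_dir, user_profile):
    """Translate a Linux host path to its path inside the container.

    Returns None when the path lies outside both mapped folders.
    """
    host = PurePosixPath(host_path)
    # The Metadoon folder is checked first because it may itself sit
    # inside the user profile; the more specific mount should win.
    mounts = [
        (PurePosixPath(metadoon_dir), PurePosixPath("/workspace")),
        (PurePosixPath(user_profile), PurePosixPath("/app/YOUR_DATA")),
    ]
    for host_root, container_root in mounts:
        try:
            relative = host.relative_to(host_root)
        except ValueError:
            continue  # not under this mount, try the next one
        return str(container_root / relative)
    return None


if __name__ == "__main__":
    # Hypothetical host layout, for illustration only.
    print(to_container_path("/home/ana/Downloads/S1_R1_001.fastq",
                            "/home/ana/Metadoon", "/home/ana"))
```

A result of `None` means the file is not visible from inside the container and should be moved into one of the mapped folders first.
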
- Merge Pairs: Merges R1 and R2 using VSEARCH.
- Quality Filter: Filters reads based on MaxEE.
- Dereplication: Identifies unique sequences.
- Clustering: OTU (97%) or ASV (Denoising).
- Chimera Removal: De novo + Reference-based.
- Taxonomy: SINTAX algorithm.
- Statistics (R): Alpha/Beta Diversity, Rarefaction, DESeq2, ANCOM-BC.
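
For readers who want to see roughly what these steps look like on the command line, the sketch below builds one VSEARCH invocation per step for the OTU (97%) route. It is an illustration under assumptions, not the exact commands Metadoon runs: the function name, the default values, and the intermediate file names are hypothetical, and only standard VSEARCH options are used. The R statistics step is not covered.

```python
def build_vsearch_commands(r1, r2, chimera_db, taxonomy_db,
                           threads=4, max_ee=1.0, identity=0.97):
    """Return a list of VSEARCH argument lists, one per pipeline step.

    Nothing is executed here; pass each list to subprocess.run() to run it.
    File names below are hypothetical placeholders.
    """
    t = ["--threads", str(threads)]
    return [
        # 1. Merge pairs: combine R1 and R2 into single reads.
        ["vsearch", "--fastq_mergepairs", r1, "--reverse", r2,
         "--fastqout", "Merged/merged.fastq", *t],
        # 2. Quality filter: discard reads whose expected errors exceed max_ee.
        ["vsearch", "--fastq_filter", "Merged/merged.fastq",
         "--fastq_maxee", str(max_ee), "--fastaout", "Filtered/filtered.fasta"],
        # 3. Dereplication: keep unique sequences with abundance annotations.
        ["vsearch", "--derep_fulllength", "Filtered/filtered.fasta",
         "--sizeout", "--output", "Dereplicated/derep.fasta"],
        # 4. Clustering: OTUs at the given identity threshold.
        ["vsearch", "--cluster_size", "Dereplicated/derep.fasta",
         "--id", str(identity), "--sizein", "--sizeout",
         "--centroids", "OTUs/centroids.fasta", *t],
        # 5a. Chimera removal, de novo.
        ["vsearch", "--uchime_denovo", "OTUs/centroids.fasta",
         "--nonchimeras", "OTUs/denovo_nonchimeras.fasta"],
        # 5b. Chimera removal, reference-based.
        ["vsearch", "--uchime_ref", "OTUs/denovo_nonchimeras.fasta",
         "--db", chimera_db, "--nonchimeras", "OTUs/otus.fasta", *t],
        # 6. Taxonomy: SINTAX classification of the final sequences.
        ["vsearch", "--sintax", "OTUs/otus.fasta", "--db", taxonomy_db,
         "--tabbedout", "Taxonomy/taxonomy_raw.txt", *t],
    ]


if __name__ == "__main__":
    for cmd in build_vsearch_commands("S1_R1_001.fastq", "S1_R2_001.fastq",
                                      "DB/chimera_ref.fasta", "DB/sintax_db.fasta"):
        print(" ".join(cmd))
```

For the ASV route, step 4 would use a denoising command such as `--cluster_unoise` instead of `--cluster_size`; which variant Metadoon uses internally is not stated here.
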
Metadoon automatically manages file organization.
Metadoon/
│
├── metadoon.py # Main GUI script
├── Analise.R # Statistical analysis script (R)
├── generate_report.R # Report generation script
├── Metadoon_Report.Rmd # RMarkdown template
├── pipeline_params.json # Configuration file
├── metadoon_env.yaml # Conda environment definition
├── setup.sh # Native installation script (Linux)
├── LICENSE # License file
├── Readme.md # Project documentation
├── Windows_Run.bat # Launcher scripts (Docker on Windows/Linux, Conda on macOS)
├── MacOS_Run.command
├── Linux_Run.sh
└── Example_Data.txt # Links to Download a dataset for testing
Once the pipeline runs, Metadoon creates specific folders to organize the workflow:
Metadoon/
│
├── DB/ # Downloaded reference databases (RDP, Silva, etc.)
├── Metadata File/ # Stores the uploaded metadata file
├── Tree File/ # Stores the phylogenetic tree (if provided)
│
├── Merged/ # Paired-end reads merged by VSEARCH
├── FullFiles/ # Concatenated merged reads
├── Filtered/ # Quality filtered sequences
├── Dereplicated/ # Unique sequences (dereplication)
│
├── OTUs/ # Clustering results
│ ├── centroids.fasta # Representative sequences
│ ├── otus.fasta # Final OTUs/ASVs (non-chimeric)
│ └── otutab.txt # Abundance table
│
├── Taxonomy/ # Taxonomic classification results
│ ├── taxonomy_raw.txt # Raw output from SINTAX
│ └── taxonomy.txt # Cleaned taxonomy table for R
│
└── Output/ # FINAL RESULTS
├── Plots (Alpha/Beta diversity, Heatmaps, Rarefaction)
├── Statistical Tables (DESeq2, ANCOM-BC, PERMANOVA)
└── Metadoon_Report.html # Complete HTML Summary
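
After a run, the layout above can be verified with a few lines of Python. This is a minimal sketch that only checks the files named explicitly in the tree; the function name `missing_outputs` is hypothetical, and plots and statistical tables are not checked because their file names are not listed.

```python
from pathlib import Path

# Relative paths taken from the folder layout above.
EXPECTED_OUTPUTS = (
    "OTUs/centroids.fasta",
    "OTUs/otus.fasta",
    "OTUs/otutab.txt",
    "Taxonomy/taxonomy_raw.txt",
    "Taxonomy/taxonomy.txt",
    "Output/Metadoon_Report.html",
)


def missing_outputs(metadoon_dir):
    """Return the expected result files that do not exist under metadoon_dir."""
    root = Path(metadoon_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs(".")
    print("Missing:", missing if missing else "none")
```

Run it from inside the Metadoon folder. Note that Metadoon_Report.html is only expected once the Generate Report step has finished.
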
- Format: Illumina Paired-End .fastq.
- Naming: Must contain _R1_ and _R2_.
- No Special Characters: Avoid spaces or extra hyphens in sample names.
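
These requirements can be checked before loading files into the interface. The sketch below is one possible pre-flight check, not Metadoon's own validation: the function name is hypothetical, mates are paired by swapping `_R1_` for `_R2_` (an assumption about how files are matched), and it rejects every hyphen and space, which is stricter than the rule above.

```python
import re
from pathlib import Path

# Strict interpretation (assumption): only letters, digits, '_' and '.' are allowed.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.]+")


def check_fastq_names(filenames):
    """Return a list of human-readable problems found in the file names."""
    problems = []
    r1, r2 = {}, {}
    for name in filenames:
        base = Path(name).name
        if not SAFE_NAME.fullmatch(base):
            problems.append(f"{base}: contains spaces, hyphens or other special characters")
        # Build a pairing key by masking the read-direction tag.
        if "_R1_" in base:
            r1[base.replace("_R1_", "_R?_", 1)] = base
        elif "_R2_" in base:
            r2[base.replace("_R2_", "_R?_", 1)] = base
        else:
            problems.append(f"{base}: name contains neither _R1_ nor _R2_")
    # Keys present on only one side have no mate file.
    for key in sorted(r1.keys() ^ r2.keys()):
        base = r1.get(key) or r2.get(key)
        problems.append(f"{base}: no matching mate file")
    return problems


if __name__ == "__main__":
    for problem in check_fastq_names(["S1_R1_001.fastq", "S1_R2_001.fastq",
                                      "S2_R1_001.fastq"]):
        print(problem)
```

An empty list means every file is part of an R1/R2 pair and uses only the conservative character set.
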
For issues or questions: 📧 rdo.adan@gmail.com
