rdo-adan/Metadoon

🧪 Metadoon

User-friendly graphical interface and pipeline for amplicon-based metagenomic data analysis.


Metadoon automates the workflow from FASTQ preprocessing to statistical analysis and visualization in R, using tools such as VSEARCH and phyloseq. It features a streamlined 5-step interface and runs either in Docker or natively via Conda.


📦 What's Included

The environment includes:

| Component   | Purpose                                        |
|-------------|------------------------------------------------|
| Python 3.10 | GUI interface (Tkinter) and pipeline logic     |
| R (Latest)  | Statistical analysis and plotting              |
| VSEARCH     | FASTQ processing (merge, filter, cluster)      |
| Libraries   | phyloseq, DESeq2, ggplot2, vegan, etc.         |

🚀 Option 1: One-Click Launchers

Easy start scripts for all platforms.

⚠️ First-Time Setup (Permissions)

For macOS (.command), Linux (.sh), and WSL users only: before running the scripts for the first time, you must grant execute permissions from a terminal.

  1. Open a terminal inside the Metadoon folder.
  2. Run the command:
    chmod +x *

Note: Windows users (.bat) DO NOT need this step. You can run the file directly.

1. Prerequisites by OS

  • Windows & Linux: Docker installed (Enable WSL 2 for Windows).
  • macOS: Conda installed.
    • The macOS .command launcher runs the Native Conda version, not Docker.

2. How to Run

Just double-click the launcher for your OS:

  • 🪟 Windows: Double-click Windows_Run.bat (Runs Docker).
  • 🍎 macOS: Double-click MacOS_Run.command (Runs Conda/Native).
  • 🐧 Linux: Run ./Linux_Run.sh (Runs Docker).

🐍 Option 2: Manual Installation (Terminal)

Recommended for Linux/WSL users or advanced users who prefer manual control.

Follow these steps to run Metadoon directly on your system without the one-click scripts.

1. Prerequisites

  • Conda (Anaconda or Miniconda) must be installed.

2. Installation & Execution

Open your terminal and run the following commands in order:

Step 1: Clone the repository

git clone https://github.com/rdo-adan/Metadoon.git

Step 2: Enter the directory

cd Metadoon/

Step 3: Grant execution permissions

Essential to ensure all scripts can run.

chmod +x *

Step 4: Install dependencies

This script creates the metadoon environment and installs R, Python, and VSEARCH.

bash setup.sh

Step 5: Activate environment & Run

conda activate metadoon
python metadoon.py
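
If the GUI fails to start, it can help to confirm that the activated environment actually exposes the expected executables. The following is a minimal sketch, not part of Metadoon itself; the tool list is an assumption based on the components named in this README.

```python
import shutil

# Executables the pipeline is expected to need. This list is an assumption
# based on the components named in this README, not taken from metadoon.py.
REQUIRED_TOOLS = ("python", "Rscript", "vsearch")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Run it after `conda activate metadoon`; anything reported as missing suggests that `setup.sh` did not complete.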

🖥️ Interface & Workflow

The new interface guides you through 5 simple steps:

  1. Load FASTQ Files: Select your raw data (must contain _R1_ and _R2_).
  2. Configure Parameters: Adjust threads, max errors, and databases (optional).
  3. RUN PIPELINE: Starts the analysis (Merge -> Filter -> Cluster -> Taxonomy -> Stats).
  4. Generate Report: Creates the final HTML summary after the run finishes.
  5. Save Results: Exports all tables, plots, and reports to a clean folder.
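
To make the Merge -> Filter -> Cluster -> Taxonomy chain in step 3 more concrete, here is a rough sketch of how such a chain could be expressed as VSEARCH command lines for a single sample. The flags are standard VSEARCH options, but the function name, file names, thresholds, and the exact sequence of calls are illustrative assumptions; Metadoon's actual invocations may differ.

```python
from pathlib import Path


def build_vsearch_commands(r1, r2, workdir, sintax_db, max_ee=1.0, threads=4):
    """Return one argument list per step for a single sample.

    Hypothetical helper: file names and defaults are illustrative only.
    Each step reads the file written by the previous step.
    """
    w = Path(workdir)
    merged = str(w / "merged.fastq")
    filtered = str(w / "filtered.fasta")
    derep = str(w / "derep.fasta")
    centroids = str(w / "centroids.fasta")
    otus = str(w / "otus.fasta")
    taxonomy = str(w / "taxonomy_raw.txt")
    return [
        # 1. Merge paired-end reads.
        ["vsearch", "--fastq_mergepairs", str(r1), "--reverse", str(r2),
         "--fastqout", merged, "--threads", str(threads)],
        # 2. Discard reads whose expected errors exceed max_ee.
        ["vsearch", "--fastq_filter", merged, "--fastq_maxee", str(max_ee),
         "--fastaout", filtered],
        # 3. Collapse identical sequences, recording abundances.
        ["vsearch", "--derep_fulllength", filtered, "--sizeout",
         "--output", derep],
        # 4. Cluster at 97% identity (the OTU option).
        ["vsearch", "--cluster_size", derep, "--id", "0.97", "--sizein",
         "--sizeout", "--centroids", centroids, "--threads", str(threads)],
        # 5. De novo chimera removal.
        ["vsearch", "--uchime_denovo", centroids, "--nonchimeras", otus],
        # 6. Taxonomic assignment with SINTAX (cutoff is an assumed value).
        ["vsearch", "--sintax", otus, "--db", str(sintax_db),
         "--tabbedout", taxonomy, "--sintax_cutoff", "0.8"],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order. Building the commands as data first makes them easy to log or inspect before anything is executed.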

📂 Handling Files (Docker Users)

If using Docker (Windows/Linux script), Metadoon maps your local folders:

  • /workspace → Metadoon folder (results saved here).
  • /app/YOUR_DATA → User Profile (Documents, Downloads).
  • /app/C_Drive → C: Drive (Windows only).
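
When browsing for files inside the Docker GUI, it helps to know where a host file will appear in the container. The sketch below translates a host path to its container path under the mapping listed above. The host directories (`/home/alice/...`) are hypothetical examples, and the helper is not part of Metadoon.

```python
from pathlib import PurePosixPath

# (host directory, container mount point) pairs. The host paths are
# hypothetical examples; the mount points are the ones listed above.
EXAMPLE_MOUNTS = [
    (PurePosixPath("/home/alice/Metadoon"), PurePosixPath("/workspace")),
    (PurePosixPath("/home/alice"), PurePosixPath("/app/YOUR_DATA")),
]


def to_container_path(host_path, mounts=EXAMPLE_MOUNTS):
    """Translate a host path to its location inside the container.

    The first matching mount wins, so list more specific host
    directories before their parents. Returns None if unmapped.
    """
    host_path = PurePosixPath(host_path)
    for host_dir, mount_point in mounts:
        try:
            relative = host_path.relative_to(host_dir)
        except ValueError:
            continue  # host_path is not inside this host directory
        return mount_point / relative
    return None
```

For example, under these assumed mounts a file in the user's Downloads folder would be found under `/app/YOUR_DATA/Downloads/` in the container, while a path outside every mapped folder is not visible to Metadoon at all.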

💡 Native/macOS Users: You have direct access to your entire file system.


⚙️ Pipeline Details

  1. Merge Pairs: Merges R1 and R2 using VSEARCH.
  2. Quality Filter: Filters reads based on MaxEE.
  3. Dereplication: Identifies unique sequences.
  4. Clustering: OTU (97%) or ASV (Denoising).
  5. Chimera Removal: De novo + Reference-based.
  6. Taxonomy: SINTAX algorithm.
  7. Statistics (R): Alpha/Beta Diversity, Rarefaction, DESeq2, ANCOM-BC.
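
The MaxEE filter in step 2 is based on a read's expected number of errors, which is the sum of the per-base error probabilities implied by its quality scores (p = 10^(-Q/10)). A minimal sketch of that calculation, assuming standard Phred+33 encoded quality strings (function names are illustrative):

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(
        10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string
    )


def passes_maxee(quality_string, max_ee=1.0):
    """True if the read's expected errors do not exceed max_ee."""
    return expected_errors(quality_string) <= max_ee
```

A 100-base read of all Q40 bases (`"I" * 100`) has about 0.01 expected errors and passes a threshold of 1.0, whereas 20 bases at Q10 (`"+" * 20`) give about 2.0 expected errors and would be discarded.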

📁 Project Structure

Metadoon automatically manages file organization.

Core Files (Before Run)

Metadoon/
│
├── metadoon.py              # Main GUI script
├── Analise.R                # Statistical analysis script (R)
├── generate_report.R        # Report generation script
├── Metadoon_Report.Rmd      # RMarkdown template
├── pipeline_params.json     # Configuration file
├── metadoon_env.yaml        # Conda environment definition
├── setup.sh                 # Native installation script (Linux)
├── LICENSE                  # License file
├── Readme.md                # Project documentation
├── Windows_Run.bat          # One-click launchers (Docker on Windows/Linux, Conda on macOS)
├── MacOS_Run.command
├── Linux_Run.sh
└── Example_Data.txt         # Links to download a test dataset
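
Since `pipeline_params.json` is a plain JSON file, it can be inspected or reused from other scripts. The sketch below shows one way to read such a file with fallback defaults. The key names (`threads`, `max_ee`) are hypothetical and may not match the keys Metadoon actually writes, so check the real file first.

```python
import json
from pathlib import Path

# Hypothetical defaults: these key names are illustrative and may not
# match the keys Metadoon actually stores in pipeline_params.json.
DEFAULTS = {"threads": 4, "max_ee": 1.0}


def load_params(path="pipeline_params.json", defaults=DEFAULTS):
    """Read a JSON parameter file, falling back to defaults for
    missing keys or a missing file."""
    params = dict(defaults)
    file = Path(path)
    if file.is_file():
        params.update(json.loads(file.read_text(encoding="utf-8")))
    return params
```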

Generated Directories (After Run)

Once the pipeline runs, Metadoon creates specific folders to organize the workflow:

Metadoon/
│
├── DB/                      # Downloaded reference databases (RDP, Silva, etc.)
├── Metadata File/           # Stores the uploaded metadata file
├── Tree File/               # Stores the phylogenetic tree (if provided)
│
├── Merged/                  # Paired-end reads merged by VSEARCH
├── FullFiles/               # Concatenated merged reads
├── Filtered/                # Quality filtered sequences
├── Dereplicated/            # Unique sequences (dereplication)
│
├── OTUs/                    # Clustering results
│   ├── centroids.fasta      # Representative sequences
│   ├── otus.fasta           # Final OTUs/ASVs (non-chimeric)
│   └── otutab.txt           # Abundance table
│
├── Taxonomy/                # Taxonomic classification results
│   ├── taxonomy_raw.txt     # Raw output from SINTAX
│   └── taxonomy.txt         # Cleaned taxonomy table for R
│
└── Output/                  # FINAL RESULTS
    ├── Plots (Alpha/Beta diversity, Heatmaps, Rarefaction)
    ├── Statistical Tables (DESeq2, ANCOM-BC, PERMANOVA)
    └── Metadoon_Report.html # Complete HTML Summary
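
The abundance table in `OTUs/otutab.txt` can also be inspected outside the GUI. The sketch below assumes the usual tab-separated layout (a header row with an OTU ID column followed by one column per sample); if Metadoon writes a different layout, the parser needs adjusting. It also computes the Shannon index, one common alpha-diversity measure, purely as an illustration rather than a reproduction of what `Analise.R` does.

```python
import csv
import io
import math


def read_otu_table(handle):
    """Parse a tab-separated OTU table into {sample: {otu: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the OTU identifiers
    table = {sample: {} for sample in samples}
    for row in reader:
        if not row:
            continue  # skip blank lines
        otu, counts = row[0], row[1:]
        for sample, count in zip(samples, counts):
            table[sample][otu] = int(count)
    return table


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Tiny made-up table used only to demonstrate the functions.
    demo = io.StringIO("#OTU ID\tA\tB\nOtu1\t10\t0\nOtu2\t10\t5\n")
    for sample, otus in read_otu_table(demo).items():
        print(sample, sum(otus.values()), round(shannon(otus.values()), 4))
```

To use it on real output, replace the demo handle with `open("OTUs/otutab.txt", newline="")`.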

⚠️ Input Data Requirements

  • Format: Illumina Paired-End .fastq.
  • Naming: Must contain _R1_ and _R2_.
  • No Special Characters: Avoid spaces or extra hyphens in sample names.
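
These requirements can be checked before loading files into the GUI. The following sketch pairs files by the `_R1_` / `_R2_` tag and flags names that may cause trouble. The allowed-character rule is a deliberately conservative assumption (letters, digits, underscores, and dots only, so it flags every hyphen), which is stricter than the wording above.

```python
import re
from pathlib import PurePath

# Conservative assumption: only letters, digits, underscores and dots.
# This flags all hyphens, which is stricter than the README requires.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.]+$")


def pair_fastq(filenames):
    """Group files into {sample: (r1, r2)} and collect problems found."""
    r1, r2, problems = {}, {}, []
    for name in filenames:
        base = PurePath(name).name
        if not SAFE_NAME.match(base):
            problems.append(f"unsafe characters: {base}")
        if "_R1_" in base:
            r1[base.split("_R1_")[0]] = name
        elif "_R2_" in base:
            r2[base.split("_R2_")[0]] = name
        else:
            problems.append(f"no _R1_/_R2_ tag: {base}")
    # Samples with both mates form a pair; the rest are reported.
    pairs = {s: (r1[s], r2[s]) for s in sorted(r1.keys() & r2.keys())}
    for s in sorted(r1.keys() ^ r2.keys()):
        problems.append(f"missing mate for sample: {s}")
    return pairs, problems
```

The sample name is taken as everything before the read tag, so `S1_R1_001.fastq` and `S1_R2_001.fastq` are paired under `S1`.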

📬 Contact

For issues or questions: 📧 rdo.adan@gmail.com
