🧬 BioMetaGenie

release-v1.0 license-MIT

BioMetaGenie is a bioinformatics toolkit that combines the downloading, processing, and analysis of genomic data in one workflow. By wrapping several established tools behind a single CLI, it provides an end-to-end pipeline for genomic data preprocessing and reduces the manual work for researchers.

Features

  • Efficient Data Retrieval: Seamlessly download genomic data from NCBI using the SRA Toolkit.
  • Automated Conversion: Effortlessly convert downloaded files to FASTQ format.
  • High-Quality Trimming: Use Trim Galore for quality-based read trimming.
  • Detailed Sequence Reporting: Generate detailed sequence statistics reports with SeqKit.
  • Read Merging: Automatically merge paired-end reads into cohesive sequences.
  • In-Depth Analysis: Leverage Parallel Meta for comprehensive sequence abundance and count analysis.

Installation

To get started with BioMetaGenie, follow these instructions:

  1. Prerequisites:

    • Ensure that Poetry is installed for dependency management.
  2. Clone the Repository:

    git clone https://github.com/nasif-raihan/BioMetaGenie.git
    cd BioMetaGenie
  3. Setup Third-Party Dependencies:

    Currently, the setup supports only Linux distributions. Contributions that extend cross-platform compatibility are welcome, for example by creating setup-for-win.sh for Windows and setup-for-mac.sh for macOS.

  4. Make Usearch11 Executable:

    chmod +x usearch11.0.667_i86linux32
    cd ..
  5. Configuration:

    • Place your sample names or SRA accession numbers in the SRA_list.txt file located in the input directory.
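A minimal sketch of how the entries in SRA_list.txt could be checked before a run. It assumes one entry per line, treats lines starting with `#` as comments, and recognizes run accessions by an SRR/ERR/DRR prefix followed by digits. These are all assumptions, since the file format is not specified here. The function name `read_entries` is hypothetical and not part of BioMetaGenie.

```python
import re
from pathlib import Path

# Assumed shape of a run accession: SRR, ERR, or DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def read_entries(list_path):
    """Split a one-entry-per-line file into (accessions, other_names).

    Blank lines and lines starting with '#' are skipped (an assumption;
    the real file may not allow comments). Entries that do not look like
    run accessions are returned separately, since the list may also hold
    plain sample names.
    """
    accessions, other_names = [], []
    for raw in Path(list_path).read_text().splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue
        if RUN_ACCESSION.match(entry):
            accessions.append(entry)
        else:
            other_names.append(entry)
    return accessions, other_names
```

For example, `read_entries("input/SRA_list.txt")` returns the accession-like entries and the remaining names as two lists, which makes typos such as a missing digit or a stray space easier to spot before any download starts.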

Usage

BioMetaGenie reduces the full workflow to a few commands. After completing the steps above, run the following commands from the repository root directory (BioMetaGenie) to execute the entire process:

    make install
    poetry shell
    make run
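Before running the commands above, it can help to confirm that the command-line tools they rely on (git, make, and poetry) are available on PATH. The following is a small sketch using only the Python standard library. The helper name `missing_tools` is hypothetical and not part of BioMetaGenie.

```python
import shutil


def missing_tools(tools=("git", "make", "poetry")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

The check only reports whether an executable with that name exists; it does not verify versions.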

Other example usages:

  • Download SRA:
        python main.py download_sra SRR123456
  • Convert to FASTQ:
        python main.py convert_to_fastq SRR123456
  • Download and process a list:
        python main.py process_sra_list
  • Trim sequences:
        python main.py trim
  • Get sample stats:
        python main.py get_sample_stats
  • Merge reads:
        python main.py merge_reads sample123
  • Analyze:
        python main.py analyze

All outputs are stored in the output directory.
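The individual subcommands listed above could also be chained from a script for a single accession. The sketch below rests on several assumptions: that the steps run in the order they are listed, that the sample name passed to merge_reads matches the accession unless stated otherwise, and that the script is started from the repository root where main.py lives. The helper names `build_commands` and `run_steps` are hypothetical.

```python
import subprocess
import sys


def build_commands(accession, sample=None, python=sys.executable):
    """Build the main.py invocations for one accession, in listed order."""
    # Assumption: the merged sample is named after the accession by default.
    sample = sample or accession
    return [
        [python, "main.py", "download_sra", accession],
        [python, "main.py", "convert_to_fastq", accession],
        [python, "main.py", "trim"],
        [python, "main.py", "get_sample_stats"],
        [python, "main.py", "merge_reads", sample],
        [python, "main.py", "analyze"],
    ]


def run_steps(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(cmd, check=True)
```

Calling `run_steps(build_commands("SRR123456"))` only prints the six commands; passing `dry_run=False` executes them one after another and raises `subprocess.CalledProcessError` if a step fails.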

Contributing

We welcome contributions to BioMetaGenie! To contribute, please fork the repository and submit a pull request. Ensure that your code adheres to the project's coding standards and includes appropriate tests.

License

BioMetaGenie is released under the MIT License. For details, see the LICENSE file.

Contact

For support or inquiries, please use the Issues section on GitHub.


BioMetaGenie is committed to simplifying and accelerating genomic data processing, enabling researchers to concentrate on their analyses rather than on data management tasks.
