🧬 BioMetaGenie

BioMetaGenie is a bioinformatics toolkit that integrates downloading, processing, and analyzing genomic data. It wraps several established tools (SRA Toolkit, Trim Galore, SeqKit, Parallel-META) behind a single CLI, giving researchers an end-to-end workflow for genomic data preprocessing.

Features

- Efficient Data Retrieval: Download genomic data from NCBI using the SRA Toolkit.
- Automated Conversion: Convert downloaded files to FASTQ format.
- High-Quality Trimming: Trim reads with Trim Galore.
- Detailed Sequence Reporting: Generate sequence statistics reports with SeqKit.
- Read Merging: Merge paired-end reads into single sequences.
- In-Depth Analysis: Use Parallel-META for sequence abundance and count analysis.
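
To make the first four stages concrete, here is a minimal sketch of the external commands they roughly correspond to. It is not BioMetaGenie's actual implementation: the helper name `build_pipeline_commands`, the `output` directory default, and the choice of flags are assumptions for illustration. The file names follow the `<accession>_1.fastq` / `<accession>_2.fastq` pattern that `fasterq-dump --split-files` produces for paired-end runs. Read merging and the Parallel-META step are left out because this README does not specify how they are invoked.

```python
from pathlib import Path


def build_pipeline_commands(accession: str, out_dir: str = "output") -> list[list[str]]:
    """Return argv lists for download, conversion, trimming, and stats.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    out = Path(out_dir)
    # Paired-end FASTQ names as written by `fasterq-dump --split-files`.
    r1 = out / f"{accession}_1.fastq"
    r2 = out / f"{accession}_2.fastq"
    return [
        # 1. Download the run from NCBI (SRA Toolkit).
        ["prefetch", accession],
        # 2. Convert the downloaded run to FASTQ (SRA Toolkit).
        ["fasterq-dump", "--split-files", "--outdir", str(out), accession],
        # 3. Quality-trim the read pair (Trim Galore).
        ["trim_galore", "--paired", "--output_dir", str(out), str(r1), str(r2)],
        # 4. Summarize the sequences (SeqKit).
        ["seqkit", "stats", str(r1), str(r2)],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands("SRR123456"):
        print(" ".join(cmd))
```

Returning argument lists instead of shell strings keeps the sketch safe to pass to `subprocess.run` without quoting issues.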

Installation

To get started with BioMetaGenie, follow these instructions:

Prerequisites:
- Ensure that Poetry is installed for dependency management.

Clone the Repository:

    git clone https://github.com/nasif-raihan/BioMetaGenie.git
    cd BioMetaGenie

Set Up Third-Party Dependencies:
- Install the required tools by running the setup script:
```
cd third_party
bash setup-for-linux.sh
```
Currently, the setup script supports only Linux distributions. Contributions that extend cross-platform support are welcome, for example by adding setup-for-win.sh for Windows and setup-for-mac.sh for macOS.

Make Usearch11 Executable:

    chmod +x usearch11.0.667_i86linux32
    cd ..

Configuration:
- Place your sample names or SRA accession numbers in the SRA_list.txt file located in the input directory.
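
The exact format of SRA_list.txt is not documented here. The sketch below assumes one sample name or accession per line, and shows one way to load and sanity-check such a file before starting a long run. The function names `read_sra_list` and `is_run_accession`, the handling of `#` comment lines, and the de-duplication are all assumptions for illustration, not part of BioMetaGenie.

```python
import re
from pathlib import Path

# Run accessions from SRA, ENA, and DDBJ look like SRR/ERR/DRR plus digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def read_sra_list(path):
    """Return the unique, non-empty entries of the list file, in file order.

    Assumes one entry per line; blank lines and lines starting with '#'
    are skipped (the comment convention is an assumption).
    """
    entries = []
    seen = set()
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue
        if entry not in seen:
            seen.add(entry)
            entries.append(entry)
    return entries


def is_run_accession(entry):
    """True if the entry looks like an SRA run accession such as SRR123456."""
    return RUN_ACCESSION.fullmatch(entry) is not None
```

For example, `read_sra_list("input/SRA_list.txt")` returns the entries to process, and filtering them with `is_run_accession` separates run accessions from plain sample names.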

Usage

BioMetaGenie reduces the whole workflow to a few commands. After installation, run the following commands from the repository root (BioMetaGenie) to execute the entire process:

    make install
    poetry shell
    make run

Individual steps can also be run on their own:

Download SRA:

    python main.py download_sra SRR123456

Convert to FASTQ:

    python main.py convert_to_fastq SRR123456

Download and process a list:
```
python main.py process_sra_list
```
Trim sequences:
```
python main.py trim
```
Get sample stats:
```
python main.py get_sample_stats
```

Merge reads:

    python main.py merge_reads sample123

Analyze:
```
python main.py analyze
```

All outputs are written to the output directory.
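
If you prefer to run the steps one at a time from a script, the subcommands shown above can be chained. The following is a minimal sketch under stated assumptions: the step order is inferred from the Features list and is not confirmed to match what `make run` does, and the helper names `step_commands` and `run_steps` are hypothetical. By default it only prints the commands (dry run).

```python
import subprocess
import sys


def step_commands(samples):
    """Build the argv lists for the documented subcommands, in an assumed order."""
    py = sys.executable  # use the interpreter from the active Poetry shell
    cmds = [
        [py, "main.py", "process_sra_list"],
        [py, "main.py", "trim"],
        [py, "main.py", "get_sample_stats"],
    ]
    # merge_reads takes one sample name per call.
    cmds += [[py, "main.py", "merge_reads", sample] for sample in samples]
    cmds.append([py, "main.py", "analyze"])
    return cmds


def run_steps(samples, dry_run=True):
    """Print each command; when dry_run is False, run it and stop on the first failure."""
    for cmd in step_commands(samples):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_steps(["sample123"])
```

Passing `check=True` makes `subprocess.run` raise `CalledProcessError` on a non-zero exit code, so later steps never run on the output of a failed one.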

Contributing

We welcome contributions to BioMetaGenie! To contribute, please fork the repository and submit a pull request. Ensure that your code adheres to the project's coding standards and includes appropriate tests.

License

BioMetaGenie is released under the MIT License. For details, see the LICENSE file.

Contact

For support or inquiries, please use the Issues section on GitHub.

BioMetaGenie is committed to simplifying and accelerating genomic data processing, enabling researchers to concentrate on their analyses rather than on data management tasks.
