BioSeq Analyzer is an interactive web application and Python toolkit for analyzing mitochondrial DNA (mtDNA) sequences. It provides intuitive ways to upload FASTA files, inspect DNA properties like GC content, identify conserved motifs, and perform global or local pairwise alignments. Built with modular object-oriented design, it is easily extendable for future genomic tools.
- Analyze mitochondrial genomes from any species
- Visualize sequence statistics and motif frequencies
- Align and compare DNA sequences using classic algorithms
- Designed for extensibility, reusability, and teaching purposes
BioSeq Analyzer is a modular software solution for analyzing mitochondrial DNA (mtDNA) sequences. Built with object-oriented programming (OOP) principles, it offers both command-line and web-based interfaces (using Flask) for genomic data analysis. Key functionalities include parsing FASTA files, representing DNA sequences and motifs as objects, performing sequence analysis (GC content, motif search, alignments), and comparing sequences.
├── app.py # Flask web application
├── tools.py # Parsing, alignment, and motif-finding tools
├── sequence.py # MitochondrialDNA and sequence representation
├── comparer.py # High-level analysis and integration
├── templates/ # HTML templates for the web interface
│ ├── align.html
│ ├── base.html
│ ├── compare_reference.html
│ ├── index.html
│ ├── motif.html
│ └── summary.html
├── uploads/ # Directory for uploaded FASTA files
├── P53_human.fasta # Example FASTA file
├── synthetic_mtDNA_dataset.fasta # Example input file
├── requirements.txt # Python dependencies
├── documentation.md # Project documentation
├── README.md # This file
└── UML-diagram.png # UML diagram visible in the documentation.md
To run the Bio Seq Analyzer web application locally:
git clone https://github.com/hades-k/bio-seq-analyzer.git
cd bio-seq-analyzerpython -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activatepip install -r requirements.txtpython app.pyNavigate to http://127.0.0.1:5000 in your browser.
- Upload FASTA files
- View sequence statistics (length, GC content)
- Motif search and k-mer discovery
- Global & local sequence alignment
- View plots (GC content, motif distribution)
This project is structured around modern OOP principles:
Classes like MitochondrialDNA and SequenceAligner encapsulate internal data (e.g., sequences, alignment results), exposing functionality through clean public interfaces.
Sequence and Tool are abstract base classes enforcing essential methods (run, report, etc.) in all subclasses, enabling modular and consistent logic.
MitochondrialDNAinherits fromSequenceMotifFinder,SequenceAligner, andParserall inherit fromTool
This promotes reuse and shared structure across tools.
Generic interfaces (run(), report()) allow tools like Parser, MotifFinder, and SequenceAligner to be used interchangeably in pipelines and the frontend.
- Upload a FASTA file on the main page.
- View summary statistics and GC content for all sequences.
- Search for a specific motif or discover conserved motifs.
- Perform pairwise alignments between any two sequences.
- View all pairwise alignment scores in a table.
Full documentation can be found here
- The code is designed to be modular and follows OOP principles.
- The system can be extended with new analysis tools or web features.
- For debugging, check terminal error messages or use the command-line interface of individual modules.
- Developed for Advanced Programming 2024/2025
- Professor: Andrea Giovanni Nuzzolese
- Authors: Salma Maria Mede, Aidana Kaukenova