
This repository contains Python code for tasks and assignments from courses on genomic algorithms, ranging from naive exact matching to randomized motif search.


YinkaAdu/Learning_Genomic_Algorithms


Algorithms of DNA Sequencing in Python (Ongoing project)

This repository contains a collection of scripts, functions, and exercises developed as I work through the Genomic Data Science Specialization and the Bioinformatics I Honours Track Certification. My solutions to problems from the Rosalind bioinformatics platform are also included. The goal of this work is to build practical proficiency in implementing and debugging genomic algorithms used in computational biology.

Aims

To apply fundamental mathematical concepts, including combinatorics, set theory, and probability, to problems in bioinformatics.

To gain hands-on experience with core algorithmic techniques in bioinformatics, including sequence analysis, pattern matching, and motif discovery, as part of a structured genomics specialization program.

Skills and Algorithms Covered

GC content and parsing FASTA

Transcription and reverse complement

Hamming Distance computation for DNA sequence comparison

k-mer Clump Finding to locate high-frequency regions (e.g., origins of replication)

Boyer-Moore Pattern Matching Algorithm for efficient string search

Motif Detection using the MEME Suite

Use of BioPython for parsing and analyzing biological data
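Several of the skills listed above fit in a few lines of plain Python. The block below is a minimal sketch and is not taken from the repository's own scripts; the function names (`gc_content`, `reverse_complement`, `hamming_distance`, `find_clumps`) and their parameters are assumptions chosen for illustration, and the input is assumed to be uppercase or lowercase DNA over the alphabet A, C, G, T.

```python
from collections import Counter

# Translation table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_clumps(genome: str, k: int, window: int, t: int) -> set[str]:
    """k-mers that occur at least t times inside some window of the given length."""
    clumps: set[str] = set()
    if window < k or len(genome) < window:
        return clumps
    # Count the k-mers of the first window once.
    counts = Counter(genome[i:i + k] for i in range(window - k + 1))
    clumps.update(kmer for kmer, n in counts.items() if n >= t)
    # Slide the window: drop the k-mer that leaves, add the k-mer that enters.
    for start in range(1, len(genome) - window + 1):
        counts[genome[start - 1:start - 1 + k]] -= 1
        entering = genome[start + window - k:start + window]
        counts[entering] += 1
        if counts[entering] >= t:
            clumps.add(entering)
    return clumps
```

The sliding-window update in `find_clumps` avoids recounting every window from scratch, which is the usual reason to prefer it over the naive version on genome-sized input.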

Learning Highlights

Strengthening Python skills through string manipulation, pattern matching, and data parsing

Understanding biological concepts like transcription, translation, and motifs

Practicing algorithm design with real biological datasets

Maintaining version control discipline by tracking every step on GitHub

Coursework & Certification

Attained the Spidey achievement on the Rosalind bioinformatics platform by solving 64 (2^6) problems

Completed 10 coding-intensive modules in the Genomic Data Science track

Earned the Bioinformatics I and II Honors Track Certificates, demonstrating mastery of both theory and applied coding tasks

Tools & Libraries

Python (3.10+)

BioPython

MEME Suite

SPAdes

QUAST

Jupyter Notebook (for stepwise development)

Completed Steps

Extended the repository with dynamic programming algorithms (e.g., global/local alignment)

Implemented basic genome assembly techniques (e.g., De Bruijn graphs)

Applied algorithms to real sequencing datasets (FASTQ/FASTA) using BioPython pipelines
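To make two of the steps above concrete, here is a minimal sketch of a De Bruijn graph built from the k-mers of a string and of a global alignment score computed by dynamic programming. It is an illustration only and not the repository's implementation; the function names and the scoring defaults (`match=1`, `mismatch=-1`, `gap=-1`) are assumptions.

```python
from collections import defaultdict


def de_bruijn_graph(text: str, k: int) -> dict[str, list[str]]:
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it in text."""
    graph: defaultdict[str, list[str]] = defaultdict(list)
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        # Each k-mer contributes one edge: prefix -> suffix.
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score with a linear gap penalty, keeping one DP row."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

Keeping only the previous row is enough when just the score is needed; recovering the alignment itself would require storing the full table or backpointers.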

Next Steps

Solve advanced Rosalind problems (e.g. Genome Assembly, Dynamic Programming, Phylogeny)

Explore algorithm implementations from Bioinformatics Algorithms by Compeau & Pevzner

Connect solutions to real genomic datasets
