Algorithms of DNA Sequencing in Python (Ongoing project)

This repository contains a collection of scripts, functions, and exercises developed while working through the Genomic Data Science Specialization and the Bioinformatics I Honors Track Certification. My solutions to problems from the Rosalind bioinformatics platform have also been uploaded. The goal of this work is to build practical proficiency in implementing and debugging the genomic algorithms used in computational biology.

Aims

To efficiently apply fundamental mathematical concepts including combinatorics, set theory and probability to tackle problems in bioinformatics.

To gain hands-on experience with core algorithmic techniques in bioinformatics, including sequence analysis, pattern matching, and motif discovery, as part of a structured genomics specialization program.

Skills and Algorithms Covered

GC content and parsing FASTA

Transcription and reverse complement

Hamming Distance computation for DNA sequence comparison

k-mer Clump Finding to locate high-frequency regions (e.g., origins of replication)

Boyer-Moore Pattern Matching Algorithm for efficient string search

Motif Detection using the MEME Suite

Use of BioPython for parsing and analyzing biological data
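
As a rough illustration of a few of the items above (GC content, reverse complement, Hamming distance, and clump finding), here is a minimal sketch in plain Python. It is not the code from the notebooks; the function names and the brute-force clump search are assumptions made for this example, and sequences are assumed to contain only A, C, G, and T.

```python
from collections import Counter

# Translation table mapping each base to its complement (illustrative helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_clumps(genome: str, k: int, window: int, t: int) -> set[str]:
    """k-mers that occur at least t times inside some window of the given length.

    Brute force: every window is recounted from scratch, which is simple
    but slow on long genomes.
    """
    clumps: set[str] = set()
    for start in range(len(genome) - window + 1):
        counts = Counter(
            genome[start + i : start + i + k] for i in range(window - k + 1)
        )
        clumps.update(kmer for kmer, n in counts.items() if n >= t)
    return clumps
```

For example, `reverse_complement("AAAACCCGGT")` returns `"ACCGGGTTTT"`, and `gc_content("ATGC")` returns `0.5`. A faster clump finder would update the counts incrementally as the window slides instead of recounting each window.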

Learning Highlights

Strengthening Python skills through string manipulation, pattern matching, and data parsing

Understanding biological concepts like transcription, translation, and motifs

Practicing algorithm design with real biological datasets

Maintaining version control discipline by tracking every step in GitHub

Coursework & Certification

Attained the Spidey achievement on the Rosalind bioinformatics platform by solving 64 (2^6) problems

Completed 10 coding-intensive modules in the Genomic Data Science track

Earned the Bioinformatics I and II Honors Track Certificates, demonstrating mastery of both theory and applied coding tasks

Tools & Libraries

Python (3.10+)

BioPython

MEME Suite

SPAdes

QUAST

Jupyter Notebook (for stepwise development)

Completed Steps

Extended the repository with dynamic programming algorithms (e.g., global/local alignment)

Implemented basic genome assembly techniques (e.g., De Bruijn graphs)

Applied algorithms to real sequencing datasets (FASTQ/FASTA) using BioPython pipelines
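
To make two of these steps concrete, the sketch below shows a global alignment score computed by dynamic programming (Needleman-Wunsch style) and a De Bruijn graph built from the k-mers of a set of reads. It is an illustrative example under stated assumptions, not the notebooks' implementation: the function names and the scoring values (match +1, mismatch -1, gap -2) are hypothetical choices made for this example.

```python
from collections import defaultdict


def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best global alignment score of a and b under a linear gap penalty.

    Only the previous row of the DP table is kept, so memory is O(len(b)).
    """
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # Column 0: a[:i] aligned against nothing.
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def de_bruijn_graph(reads: list[str], k: int) -> dict[str, list[str]]:
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it.

    Every k-mer in every read contributes one edge, so repeated k-mers
    yield repeated edges.
    """
    graph: dict[str, list[str]] = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i : i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

For example, `de_bruijn_graph(["ACGT"], 3)` returns `{"AC": ["CG"], "CG": ["GT"]}`. Reconstructing a sequence from such a graph would additionally require finding an Eulerian path, which is not shown here.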

Next Steps

Solve advanced Rosalind problems (e.g., Genome Assembly, Dynamic Programming, Phylogeny)

Explore algorithm implementations from Bioinformatics Algorithms by Compeau & Pevzner

Connect solutions to real genomic datasets

Repository Files

BoyerMooreRedo.ipynb
Fundamentals_Refresh.ipynb
GenomeAssembly.ipynb
GitHelp.ipynb
NaivematchingHomework.ipynb
PatternCount.ipynb
README.md
RandomizedMotif.ipynb
Rosalindbioinformatics.ipynb
aligned_vcp.bam
aligned_vcp.snps.vcf
aligned_vcp.sorted.bam
aligned_vcp.sorted.bam.bai
aligned_vcp.vcf
reads_vcp.fastq
vc_project.log

YinkaAdu/Learning_Genomic_Algorithms
